Assembly and screening of highly complex and fully human antibody repertoire in yeast

ABSTRACT

Compositions, methods, and kits are provided for efficiently generating and screening a library of highly diverse protein complexes for their ability to bind to other proteins or oligonucleotide sequences. In one aspect of the invention, a library of expression vectors is provided for expressing the library of protein complexes. The library comprises a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit. The first and second nucleotide sequences each independently vary within the library of expression vectors. In addition, the first and second polypeptide subunits are expressed as separate proteins which self-assemble to form a protein complex, such as a double-chain antibody fragment (dcFv or Fab) or a fully assembled antibody, in cells into which the library of expression vectors is introduced. The library of expression vectors can be efficiently generated in yeast cells through homologous recombination; and the encoded protein complexes with high binding affinity to their target molecules can be selected by high throughput screening in vivo or in vitro.

CROSS REFERENCE TO RELATED APPLICATION

[0001] This application is a divisional of U.S. application Ser. No. 09/703,399, filed Oct. 31, 2000, entitled “Assembly And Screening Of Highly Complex And Fully Human Antibody Repertoire In Yeast.” This application is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] This invention relates to compositions, methods and kits for generating libraries of recombinant expression vectors and using these libraries in screening of affinity-binding pairs, and, more particularly, for generating libraries of recombinant human antibodies and screening for their affinity binding with target antigens.

[0004] 2. Description of Related Art

[0005] Antibodies are a diverse class of molecules. Delves, P. J. (1997) “Antibody production: essential techniques”, New York, John Wiley & Sons, pp. 90-113. It is estimated that even in the absence of antigen stimulation a human makes at least 10¹⁵ different antibody molecules (its preimmune antibody repertoire). The antigen-binding sites of many antibodies can cross-react with a variety of related but different antigenic determinants, and the preimmune repertoire is apparently large enough to ensure that there will be an antigen-binding site to fit almost any potential antigenic determinant, albeit with low affinity.

[0006] Structurally, antibodies or immunoglobulins (Igs) are composed of one or more Y-shaped units. For example, immunoglobulin G (IgG) has a molecular weight of 150 kDa and consists of just one of these units. Typically, an antibody can be proteolytically cleaved by the proteinase papain into two identical Fab (fragment antigen binding) fragments and one Fc (fragment crystallizable) fragment. Each Fab contains one binding site for antigen, and the Fc portion of the antibodies mediates other aspects of the immune response.

[0007] A typical antibody contains four polypeptides: two identical copies of a heavy (H) chain and two identical copies of a light (L) chain, forming a general formula H₂L₂. Each L chain is attached to one H chain by a disulfide bond. The two H chains are also attached to each other by disulfide bonds. Papain cleaves N-terminal to the disulfide bonds that hold the H chains together. Each of the resulting Fabs consists of an entire L chain plus the N-terminal half of an H chain; the Fc is composed of the C-terminal halves of two H chains. Pepsin cleaves at numerous sites C-terminal to the inter-H disulfide bonds, resulting in the formation of a divalent fragment [F(ab′)₂] and many small fragments of the Fc portion. IgG heavy chains contain one N-terminal variable (V_(H)) plus three C-terminal constant (C_(H)1, C_(H)2 and C_(H)3) regions. Light chains contain one N-terminal variable (V_(L)) and one C-terminal constant (C_(L)) region each. The different variable and constant regions of either heavy or light chains are of roughly equal length (about 110 amino acid residues per region). Fabs consist of one V_(L), V_(H), C_(H)1, and C_(L) region each. The V_(L) and V_(H) portions contain hypervariable segments (complementarity-determining regions or CDRs) that form the antibody combining site.
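
As a rough consistency check on the figures above, the following Python sketch counts the residues in an H₂L₂ molecule from the stated number of regions per chain and the approximate 110 residues per region. The average residue mass of 110 Da is a common rule of thumb and is an assumption not stated in this text; the hinge region and glycosylation are ignored.

```python
# Rough size estimate for an IgG (H2L2) from the region counts given in the text.
RESIDUES_PER_REGION = 110   # "about 110" residues per region, from the text
AVG_RESIDUE_MASS_DA = 110   # assumption: rule-of-thumb average residue mass

HEAVY_REGIONS = 4           # VH, CH1, CH2, CH3
LIGHT_REGIONS = 2           # VL, CL

heavy_len = HEAVY_REGIONS * RESIDUES_PER_REGION
light_len = LIGHT_REGIONS * RESIDUES_PER_REGION

# Two heavy chains plus two light chains.
igg_residues = 2 * heavy_len + 2 * light_len
igg_mass_da = igg_residues * AVG_RESIDUE_MASS_DA

# A Fab holds one VL, VH, CH1 and CL region.
fab_residues = 4 * RESIDUES_PER_REGION

print(igg_residues, igg_mass_da, fab_residues)
```

Under these assumptions the estimate comes to roughly 145 kDa, which is in line with the 150 kDa cited for IgG.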

[0008] The V_(L) and V_(H) portions of a monoclonal antibody have also been linked by a synthetic linker to form a single chain protein (scFv) which retains the same specificity and affinity for the antigen as the monoclonal antibody itself. Bird, R. E., et al. (1988) “Single-chain antigen-binding proteins” Science 242:423-426. A typical scFv is a recombinant polypeptide composed of a V_(L) tethered to a V_(H) by a designed peptide, such as (Gly₄-Ser)₃, that links the carboxyl terminus of the V_(L) to the amino terminus of the V_(H) sequence. The construction of the DNA sequence encoding a scFv can be achieved by using a universal primer encoding the (Gly₄-Ser)₃ linker by polymerase chain reactions (PCR). Lake, D. F., et al. (1995) “Generation of diverse single-chain proteins using a universal (Gly₄-Ser)₃ encoding oligonucleotide” Biotechniques 19:700-702.
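
The scFv layout described above, with the carboxyl terminus of V_(L) joined to the amino terminus of V_(H) through a (Gly₄-Ser)₃ linker, can be illustrated at the amino acid level with a short Python sketch. The function name and the two short variable-region strings are hypothetical placeholders, not real antibody sequences.

```python
# Assemble an scFv amino acid sequence as VL - (Gly4-Ser)3 - VH.
LINKER_UNIT = "GGGGS"        # one Gly4-Ser repeat in one-letter code
LINKER = LINKER_UNIT * 3     # (Gly4-Ser)3, 15 residues

def build_scfv(vl: str, vh: str, linker: str = LINKER) -> str:
    """Join the C-terminus of VL to the N-terminus of VH through the linker."""
    return vl + linker + vh

# Placeholder fragments for illustration only (assumed, not from the text).
vl_example = "DIQMTQSPSS"
vh_example = "EVQLVESGGG"

scfv = build_scfv(vl_example, vh_example)
print(scfv)
```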

[0009] The mammalian immune system has evolved unique genetic mechanisms that enable it to generate an almost unlimited number of different light and heavy chains in a remarkably economical way by joining separate gene segments together before they are transcribed. For each type of Ig chain (κ light chains, λ light chains, and heavy chains) there is a separate pool of gene segments from which a single peptide chain is eventually synthesized. Each pool is on a different chromosome and usually contains a large number of gene segments encoding the V region of an Ig chain and a smaller number of gene segments encoding the C region. During B cell development a complete coding sequence for each of the two Ig chains to be synthesized is assembled by site-specific genetic recombination, bringing together the entire coding sequence for a V region and the coding sequence for a C region. In addition, the V region of a light chain is encoded by a DNA sequence assembled from two gene segments: a V gene segment and a short joining or J gene segment. The V region of a heavy chain is encoded by a DNA sequence assembled from three gene segments: a V gene segment, a J gene segment and a diversity or D segment.

[0010] The large number of inherited V, J and D gene segments available for encoding Ig chains makes a substantial contribution on its own to antibody diversity, but the combinatorial joining of these segments greatly increases this contribution. Further, imprecise joining of gene segments and somatic mutations introduced during the V-D-J segment joining at the pre-B cell stage greatly increase the diversity of the V regions.

[0011] After immunization against an antigen, a mammal goes through a process known as affinity maturation to produce antibodies with higher affinity toward the antigen. Such antigen-driven somatic hypermutation fine-tunes antibody responses to a given antigen, presumably due to the accumulation of point mutations specifically in both heavy- and light-chain V region coding sequences and a selected expansion of high-affinity antibody-bearing B cell clones.

[0012] Great efforts have been made to mimic such a natural maturation of antibodies against various antigens, especially antigens associated with diseases such as autoimmune diseases, cancer, AIDS and asthma. In particular, phage display technology has been used extensively to generate large libraries of antibody fragments by exploiting the capability of bacteriophage to express and display biologically functional protein molecules on its surface. Combinatorial libraries of antibodies have been generated in bacteriophage lambda expression systems which may be screened as bacteriophage plaques or as colonies of lysogens (Huse et al. (1989) Science 246: 1275; Caton and Koprowski (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 6450; Mullinax et al. (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 8095; Persson et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 2432). Various embodiments of bacteriophage antibody display libraries and lambda phage expression libraries have been described (Kang et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 4363; Clackson et al. (1991) Nature 352: 624; McCafferty et al. (1990) Nature 348: 552; Burton et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 10134; Hoogenboom et al. (1991) Nucleic Acids Res. 19: 4133; Chang et al. (1991) J. Immunol. 147: 3610; Breitling et al. (1991) Gene 104: 147; Marks et al. (1991) J. Mol. Biol. 222: 581; Barbas et al. (1992) Proc. Natl. Acad. Sci. (U.S.A.) 89: 4457; Hawkins and Winter (1992) J. Immunol. 22: 867; Marks et al. (1992) Biotechnology 10: 779; Marks et al. (1992) J. Biol. Chem. 267: 16007; Lowman et al. (1991) Biochemistry 30: 10832; Lerner et al. (1992) Science 258: 1313). Also see review by Rader, C. and Barbas, C. F. (1997) “Phage display of combinatorial antibody libraries” Curr. Opin. Biotechnol. 8:503-508.

[0013] Various scFv libraries displayed on bacteriophage coat proteins have been described. Marks et al. (1992) Biotechnology 10: 779; Winter G and Milstein C (1991) Nature 349: 293; Clackson et al. (1991) op. cit.; Marks et al. (1991) J. Mol. Biol. 222: 581; Chaudhary et al. (1990) Proc. Natl. Acad. Sci. (USA) 87: 1066; Chiswell et al. (1992) TIBTECH 10: 80; and Huston et al. (1988) Proc. Natl. Acad. Sci. (USA) 85: 5879.

[0014] Generally, a phage library is created by inserting a library of random oligonucleotides or a cDNA library encoding antibody fragments such as V_(L) and V_(H) into gene 3 of M13 or fd phage. Each inserted gene is expressed at the N-terminus of the gene 3 product, a minor coat protein of the phage. As a result, peptide libraries that contain diverse peptides can be constructed. The phage library is then affinity screened against an immobilized target molecule of interest, such as an antigen, and specifically bound phages are recovered and amplified by infection into Escherichia coli host cells. Typically, the target molecule of interest such as a receptor (e.g., polypeptide, carbohydrate, glycoprotein, nucleic acid) is immobilized by covalent linkage to a chromatography resin (to enrich for reactive phage by affinity chromatography) and/or labeled to screen plaques or colony lifts. This procedure is called biopanning. Finally, amplified phages can be sequenced to deduce the specific peptide sequences. Due to the inherent nature of phage display, the antibodies displayed on the surface of the phage may not adopt their native conformation under such in vitro selection conditions as they would in a mammalian system. In addition, bacteria do not readily process, assemble, or express/secrete functional antibodies.

[0015] Transgenic animals such as mice have been used to generate fully human antibodies by using the XENOMOUSE™ technology developed by companies such as Abgenix, Inc., Fremont, Calif. and Medarex, Inc., Annandale, N.J. Strains of mice are engineered by suppressing mouse antibody gene expression and functionally replacing it with human antibody gene expression. This technology utilizes the natural power of the mouse immune system in surveillance and affinity maturation to produce a broad repertoire of high affinity antibodies. However, the breeding of such strains of transgenic mice and selection of high affinity antibodies can take a long period of time. Further, the antigen against which the pool of the human antibody is selected has to be recognized by the mouse as a foreign antigen in order to mount an immune response; antibodies against a target antigen that does not have immunogenicity in a mouse may not be able to be selected by using this technology. In addition, there may be regulatory issues regarding the use of transgenic animals, such as transgenic goats (developed by Genzyme Transgenics, Framingham, Mass.) and chickens (developed by Geneworks, Inc., Ann Arbor, Mich.), to produce antibodies, as well as safety issues concerning containment of transgenic animals infected with recombinant viral vectors.

[0016] Antibodies and antibody fragments have also been produced in transgenic plants. Plants, such as corn plants (developed by Integrated Protein Technologies, St. Louis, Mo.), are transformed with vectors carrying antibody genes, which results in stable integration of these foreign genes into the plant genome. In comparison, most microorganisms transformed with plasmids can lose the plasmids during a prolonged fermentation. Transgenic plants may be used as a cheaper means to produce antibodies on a large scale. However, due to the long growth cycles of plants, screening for antibodies with high binding affinity toward a target antigen may not be efficient or feasible for high throughput screening in plants.

SUMMARY OF THE INVENTION

[0017] The present invention provides compositions, methods, and kits for efficiently generating and screening protein complexes for their ability to bind to other proteins or oligonucleotide sequences. One feature of the present invention is the production of two or more polypeptides which self-assemble to form a protein complex in vivo. The in vivo formed protein complex is then tested in the same in vivo system for the complex's ability to bind to either a protein or a nucleotide sequence (DNA or RNA). The ability to express polypeptides, form protein complexes of those polypeptides, and screen the protein complexes all in the same intracellular system enables the present invention to screen large populations of protein complexes for binding with high throughput.

[0018] In one aspect of the present invention, compositions are provided. These compositions may be used for screening affinity-binding pairs between a tester protein complex and a target molecule in vitro or in vivo. The target molecule may be a protein, peptide, DNA, RNA, or small molecule.

[0019] In one embodiment, a library of yeast expression vectors is provided which express the protein complex to be screened. The yeast expression vectors forming the library comprise a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors.

[0020] According to this embodiment, the first polypeptide subunit and the second polypeptide subunit can be expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters, or by expressing the polypeptide subunits bicistronically from the same promoter via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.

[0021] Also according to the embodiment, the yeast expression vector may be a 2μ plasmid or a yc-type (centromeric) vector, preferably a yeast-bacterial shuttle vector which contains a bacterial origin of replication.

[0022] Also according to the embodiment, the first polypeptide subunit and/or the second polypeptide can be expressed as a fusion protein with a cell wall/membrane protein, such as the yeast agglutinin cell wall protein. Such a fusion allows transportation of the protein complex (e.g. antibody) formed between the first and second subunits to the cell wall/membrane, thus effectively mimicking the cell surface display of antibodies by B cells in the immune system for affinity maturation in vivo.

[0023] Alternatively, the first polypeptide subunit or the second polypeptide can be expressed as a fusion protein with a nuclear protein, such as the nuclear transport domain of a transcription factor. Such a fusion allows transportation of the protein complex (e.g. antibody) formed between the first and second subunits to the nucleus where interaction of the antibody with nuclear target(s) occurs.

[0024] In another embodiment, a library of expression vectors is provided. The expression vectors forming the library comprise: a transcription sequence encoding an activation domain or a DNA binding domain of a transcription activator; a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors.

[0025] The activation domain or the DNA binding domain of the transcription activator and the first polypeptide subunit are expressed as a single fusion protein. The second polypeptide subunit is expressed as a separate protein or peptide from the first polypeptide.

[0026] According to this embodiment, the expression vector may be a bacterial, phage, yeast, mammalian or viral expression vector, preferably a yeast expression vector, and more preferably a 2μ plasmid yeast expression vector.

[0027] Also according to this embodiment, the transcription activator sequence may be located 5′ relative to the first nucleotide sequence. Alternatively, the transcription activator sequence may be located 3′ relative to the first nucleotide sequence.

[0028] In yet another embodiment, a library of transformed yeast cells is provided. The library of yeast cells comprises a library of yeast expression vectors. The expression vectors in the library of transformed yeast cells comprise: a transcription sequence encoding an activation domain or a DNA binding domain of a transcription activator; a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The activation domain or the DNA binding domain of the transcription activator and the first polypeptide subunit are expressed as a single fusion protein. The second polypeptide subunit is expressed as a separate protein or peptide from the first polypeptide.

[0029] According to this embodiment, the yeast cells may be diploid yeast cells. Alternatively, the yeast cells may be haploids such as the a and α strains of yeast haploid cells.

[0030] In another aspect of the present invention, methods are provided for generating a library of yeast expression vectors that may be used for screening protein-protein or protein-DNA binding pairs.

[0031] In one embodiment, the method comprises: transforming into yeast cells a library of insert nucleotide sequences that are linear and double-stranded, and a library of linearized yeast expression vectors, each having a 5′- and 3′-terminus sequence at the site of linearization.

[0032] The linearized yeast expression vectors of the vector library comprise a first polynucleotide sequence encoding a first polypeptide subunit which varies within the vector library. The insert sequences of the insert library comprise a second nucleotide sequence encoding a second polypeptide subunit which varies within the insert library. Each of the insert sequences also comprises a 5′- and 3′-flanking sequence at the respective ends of the insert sequence. The 5′- and 3′-flanking sequences of the insert sequence are sufficiently homologous to the 5′- and 3′-terminus sequences of the linearized yeast expression vector, respectively, to enable homologous recombination to occur.

[0033] Homologous recombination occurring between the vector and the insert sequence results in inclusion of the insert sequence into the vector in the transformed yeast cells. Since the first and second nucleotide sequences vary independently within the insert library (having a complexity of 10^(x)) and vector library (having a complexity of 10^(y)), respectively, the complexity of the library formed as a result of homologous recombination should theoretically be 10^(x+y).
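
The combinatorial arithmetic above can be checked with a small Python sketch. The function names are hypothetical. The second function adds an assumption that is not stated in the text: that each transformed yeast cell carries a single recombinant, so that the number of transformants bounds the number of distinct combinations actually recovered.

```python
def theoretical_complexity(x: int, y: int) -> int:
    """Combinations from an insert library of 10**x and a vector library of 10**y."""
    return 10 ** x * 10 ** y          # equals 10 ** (x + y)

def recoverable_complexity(x: int, y: int, transformants: int) -> int:
    """Upper bound on distinct recombinants recovered.

    Assumption (not from the text): one recombinant per transformant, so the
    bound is the smaller of the theoretical complexity and the transformant count.
    """
    return min(theoretical_complexity(x, y), transformants)

# Example: a 10**4 insert library combined with a 10**3 vector library.
print(theoretical_complexity(4, 3))
print(recoverable_complexity(4, 3, transformants=10 ** 6))
```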

[0034] In this embodiment, the first polypeptide subunit and the second polypeptide subunit are expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters on the vector, or by expressing the polypeptide subunits bicistronically from the same promoter on the vector via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.

[0035] According to the embodiment, the 5′- and 3′-flanking sequences of the insert sequence are preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 45-55 bp in length.
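
The insertion of an insert sequence into a linearized vector through matching flanking and terminus sequences can be sketched in Python as plain string manipulation. This is a simplified illustration and not a model of the yeast recombination machinery: "sufficiently homologous" is reduced to an exact match, the 30-120 bp preference is enforced as a hard limit, and all names and sequences below are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional

MIN_FLANK_BP, MAX_FLANK_BP = 30, 120   # broadest preferred range from the text

@dataclass(frozen=True)
class LinearizedVector:
    left_arm: str    # ends with the 5'-terminus sequence at the linearization site
    right_arm: str   # begins with the 3'-terminus sequence at the linearization site

def recombine(vector: LinearizedVector, insert: str, flank_bp: int) -> Optional[str]:
    """Return the repaired vector sequence, or None if either flank fails to match.

    Simplification: homology is modeled as an exact match of each flank.
    """
    if not MIN_FLANK_BP <= flank_bp <= MAX_FLANK_BP:
        raise ValueError("flank length outside the 30-120 bp range")
    if len(insert) < 2 * flank_bp:
        raise ValueError("insert shorter than its two flanks")
    five_flank, three_flank = insert[:flank_bp], insert[-flank_bp:]
    if not vector.left_arm.endswith(five_flank):
        return None
    if not vector.right_arm.startswith(three_flank):
        return None
    payload = insert[flank_bp:len(insert) - flank_bp]
    return vector.left_arm + payload + vector.right_arm

# Hypothetical 50 bp terminus sequences (placeholders, not real vector sequence).
FIVE_TERMINUS = "ACGTTGCA" * 6 + "AC"
THREE_TERMINUS = "TTGACCGA" * 6 + "TT"
vector = LinearizedVector("GGGG" + FIVE_TERMINUS, THREE_TERMINUS + "CCCC")
payload = "ATGGCTAGCTAA"   # placeholder for the varying coding sequence
product = recombine(vector, FIVE_TERMINUS + payload + THREE_TERMINUS, 50)
print(product)
```

An insert whose flanks do not match the vector termini returns None in this sketch, standing in for a failed recombination event.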

[0036] According to the embodiment, the vector library comprising the second nucleotide sequences may be constructed by directional cloning of a library of the second nucleotide sequence inserts into a yeast expression vector in bacteria. Alternatively, the vector library may be constructed by inserting a library of the second nucleotide sequence inserts into a yeast expression vector via homologous recombination in yeast. Homologous recombination in yeast is preferred due to its higher transformation efficiency.

[0037] In yet another aspect of the present invention, methods are provided for selecting tester protein complexes capable of binding to a target peptide, protein, or DNA.

[0038] In an embodiment where the target molecule is a target peptide or protein, the method comprises:

[0039] expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed between a first polypeptide subunit whose sequence varies within the library, and a second polypeptide subunit whose sequence varies within the library independently of the first polypeptide; expressing one or more target fusion proteins in the yeast cells expressing the tester proteins, each of the target fusion proteins comprising a target peptide or protein; and

[0040] selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target fusion protein.

[0041] According to this embodiment, expression of the reporter gene may be activated by a functional transcription activator being formed by the binding of the tester protein complex to the target peptide or protein as in a yeast two-hybrid system.

[0042] In a variation of the embodiment employing the yeast two-hybrid system, the tester protein forms a portion of a fusion protein with either a DNA binding domain or an activation domain of a transcriptional activator. The target protein meanwhile forms a portion of a fusion protein comprising the DNA binding domain or the activation domain of the transcriptional activator which is not present in the fusion protein comprising the tester protein. If the tester protein is able to bind to the target protein, a functional transcriptional activator is formed.
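
The two-hybrid logic described above reduces to two conditions: the tester and target fusions must carry complementary domains of the transcriptional activator, and the tester must bind the target. A minimal Python sketch of that decision follows. The class, the function names, and the lookup table of binding pairs are hypothetical illustrations and do not model real binding data.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class FusionProtein:
    domain: str    # "BD" (DNA binding domain) or "AD" (activation domain)
    partner: str   # name of the fused tester or target protein

def reporter_expressed(tester: FusionProtein,
                       target: FusionProtein,
                       binds: Callable[[str, str], bool]) -> bool:
    """True when a functional transcriptional activator is reconstituted."""
    # One fusion must carry the BD and the other the AD.
    domains_complementary = {tester.domain, target.domain} == {"AD", "BD"}
    return domains_complementary and binds(tester.partner, target.partner)

# Hypothetical binding table used only for illustration.
KNOWN_PAIRS = {("tester_1", "antigen_X")}

def binds(tester_name: str, target_name: str) -> bool:
    return (tester_name, target_name) in KNOWN_PAIRS

print(reporter_expressed(FusionProtein("AD", "tester_1"),
                         FusionProtein("BD", "antigen_X"), binds))
```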

[0043] According to this variation, the step of expressing the library of tester protein complexes may include transforming a library of tester expression vectors into the yeast cells which contain a reporter construct comprising the reporter gene whose expression is under transcriptional control of a transcription activator comprising an activation domain and a DNA binding domain.

[0044] Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The domain encoded by the first transcription sequence and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0045] Optionally, the step of expressing the target fusion proteins includes transforming a target expression vector into the yeast cells simultaneously or sequentially with the library of tester expression vectors. The target expression vector comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.

[0046] In another variation of the embodiment involving the yeast two-hybrid system, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types.

[0047] The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The domain encoded by the first transcription sequence and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0048] The second population of haploid yeast cells comprises a target expression vector. The target expression vector comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.

[0049] Either the first or second population of haploid yeast cells comprises a reporter construct comprising the reporter gene whose expression is under transcriptional control of the transcription activator.

[0050] In this variation, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains may be conducted in a rich nutritional culture medium.

[0051] Optionally, a plurality of target fusion proteins may be expressed and screened against the library of tester proteins at the same time. According to this variation, the population of haploid yeast cells comprising the expression vector encoding a target protein comprises a plurality of expression vectors encoding a plurality of target proteins. Each target protein forms a portion of a fusion protein which also comprises either an activation domain or a DNA binding domain.

[0052] According to this variation, members of the library of tester expression vectors may be arrayed as individual yeast clones in one or more multiple-well plates.

[0053] Also according to this variation, the plurality of the target expression vectors may be arrayed as individual yeast clones in one or more multiple-well plates.

[0054] Also according to this variation, mating may be based on clonal mating in which each yeast clone containing a member of the tester expression vectors is mated individually with each of the plurality of target expression vectors.
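
The clonal mating scheme above amounts to pairing every arrayed tester clone with every arrayed target clone. A brief Python sketch of that pairing follows; the clone labels and the function name are hypothetical, and no particular plate layout is implied.

```python
from itertools import product
from typing import List, Sequence, Tuple

def clonal_mating_pairs(tester_clones: Sequence[str],
                        target_clones: Sequence[str]) -> List[Tuple[str, str]]:
    """List every (tester, target) mating, one per tester/target combination."""
    return list(product(tester_clones, target_clones))

# Placeholder clone labels for illustration only.
testers = [f"tester_{i}" for i in range(1, 4)]
targets = ["target_A", "target_B"]

pairs = clonal_mating_pairs(testers, targets)
for tester, target in pairs:
    print(tester, "x", target)
```

The number of individual matings grows as the product of the two array sizes, which is why arraying in multiple-well plates is useful for this variation.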

[0055] Also according to this variation, the plurality of the target expression vectors may be a library of expression vectors containing a collection of human EST clones or a collection of domain structures.

[0056] According to any of the above-described methods for selecting protein-protein binding pairs, the target fusion protein may comprise an antigen associated with a disease state, such as a tumor-surface antigen. Optionally, the target fusion protein may comprise a receptor for a human growth factor or related ligand, such as a receptor for epidermal growth factor, transferrin, insulin-like growth factor, transforming growth factor, interleukin-1, or interleukin-2.

[0057] In another embodiment, a method is provided for screening protein-DNA binding pairs in a yeast one-hybrid system. The method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under transcriptional control of a target DNA sequence; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target DNA sequence.

[0058] In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0059] In another variation of the embodiment, the step of expressing a library of tester protein complexes in yeast cells includes causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises the reporter construct.

[0060] According to the variation, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains is preferably conducted in a rich nutritional culture medium.

[0061] According to any of the above-described methods for selecting protein-DNA binding pairs, the target DNA sequence in the reporter construct is preferably positioned in 2-6 tandem repeats 5′ relative to the reporter gene. The target DNA sequence in the reporter construct is preferably between about 15-75 bp in length and more preferably between about 25-55 bp in length.
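
The preferred reporter layout above, 2-6 tandem copies of a 15-75 bp target DNA sequence placed 5′ of the reporter gene, can be sketched in Python as follows. The function name and sequences are hypothetical placeholders; the sketch treats the stated preferences as hard limits for simplicity and omits any promoter elements that a working construct would need between the repeats and the reporter gene.

```python
def build_reporter_region(target_site: str, copies: int, reporter_gene: str) -> str:
    """Place tandem copies of the target site 5' of the reporter gene sequence."""
    if not 2 <= copies <= 6:
        raise ValueError("copies outside the preferred 2-6 range")
    if not 15 <= len(target_site) <= 75:
        raise ValueError("target site outside the preferred 15-75 bp range")
    return target_site * copies + reporter_gene

# Placeholder sequences for illustration only (assumed, not from the text).
site = "ACGTGACCTGATCGATTGCA"     # 20 bp
reporter = "ATGACCATGATTACG"      # stand-in for the start of a reporter gene

construct = build_reporter_region(site, copies=3, reporter_gene=reporter)
print(construct)
```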

[0062] In yet another embodiment, a method is provided for screening protein-protein binding pairs in a yeast one-hybrid system. The method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under transcriptional control of a specific DNA binding site; expressing a target protein in the yeast cells expressing the tester protein complexes, where the target protein binds to the specific DNA binding site; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target protein.

[0063] In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0064] In another variation of the embodiment, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises a target expression vector comprising a target sequence encoding the target protein. Either the first or second population of haploid yeast cells comprises the reporter construct.

[0065] In any of the above-described methods for selecting tester proteins capable of binding to a target peptide, protein, or DNA, the method may further comprise isolating the tester expression vectors from the selected yeast cells; and mutagenizing the first and second nucleotide sequences in the isolated tester expression vectors to form a library of mutagenized expression vectors.

[0066] Examples of mutagenesis methods include, but are not limited to, error-prone PCR mutagenesis, site-directed mutagenesis, DNA shuffling and combinations thereof. The library of mutagenized expression vectors may be screened against the same or different target peptide, protein or DNA by following similar procedures used for screening the tester expression vectors.

[0067] In yet another aspect of the present invention, methods are provided for producing a library of assembled antibodies. Examples of the assembled antibodies include, but are not limited to, a double-chain protein complex (dcFv) formed between the variable regions of the light chain (V_(L)) and heavy chain (V_(H)), the Fab (fragment antigen-binding) fragments, and a fully assembled antibody having both the variable and constant regions of the light chain and heavy chain.

[0068] In an embodiment, the method comprises: expressing in cells a library of expression vectors. Each of the expression vectors comprises a first nucleotide sequence encoding a first polypeptide subunit comprising an antibody heavy chain variable region, and a second nucleotide sequence encoding a second polypeptide subunit comprising an antibody light chain variable region. The first and second polypeptide subunits are expressed as separate proteins and self-assemble to form a dcFv, Fab, or a full antibody upon interacting with each other. Also, the first and second nucleotide sequences each independently vary within the library of expression vectors to generate a library of assembled antibodies with a diversity of at least 10⁷.

[0069] According to the embodiment, the diversity of the library of assembled antibodies is preferably between 10⁶-10¹⁶, more preferably between 10⁸-10¹⁶, and most preferably between 10¹⁰-10¹⁶.

[0070] The cells may be prokaryotic or eukaryotic cells, such as bacterial, yeast, insect, plant and mammalian cells. In a preferred embodiment, the cells in which the library of antibodies is expressed are yeast cells.

[0071] In yet another aspect of the present invention, a kit is provided for selecting tester proteins capable of binding to a target peptide, protein, or DNA.

[0072] In an embodiment, a kit is provided which comprises: a library of tester expression vectors and a yeast cell line. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. A reporter construct may be contained in the yeast cell line. The reporter construct comprises a reporter gene whose expression is under the transcriptional control of a specific DNA binding site.

[0073] Optionally, the kit may further comprise a target expression vector which comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.

[0074] In another embodiment, the kit comprises: first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. The second population of haploid yeast cells comprises a target expression vector. The target expression vector encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.

[0075] Optionally, the second population of haploid yeast cells comprises a plurality of target expression vectors. Each of the target expression vectors encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.

[0076] According to any of the above-described compositions, methods and kits, the diversity of the first and/or the second polypeptide subunit encoded by the first and second nucleotide sequences within the library of expression vectors is preferably between 10³-10⁸, more preferably between 10⁴-10⁸, and most preferably between 10⁵-10⁸.

[0077] Also according to any of the above-described compositions, methods and kits, the diversity of the protein complexes encoded by the library of expression vectors may be preferably at least 10⁶-10¹⁸, more preferably at least 10⁹-10¹⁸, and most preferably at least 10¹⁰-10¹⁸.

[0078] Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits may be each independently derived from libraries of precursor sequences that are not specifically designed for the target peptide, protein or DNA.

[0079] Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits optionally are not derived from one or more proteins that are known to bind to the target peptide, protein or DNA.

[0080] Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits optionally are not generated by mutagenizing one or more proteins that are known to bind to the target peptide, protein or DNA.

[0081] Also according to any of the above-described compositions, methods and kits, the first and the second polypeptide subunits may be subunits of a multimeric protein whose sequence varies within a library of multimeric proteins. Examples of multimeric proteins include, but are not limited to, growth factor receptors, T cell receptors, cytokine receptors, tyrosine kinase-associated receptors, and MHC proteins.

[0082] Also according to any of the above-described compositions, methods and kits, the first nucleotide sequence in the library of expression vectors comprises a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3). The second nucleotide sequence comprises a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)).

[0083] Alternatively, the first nucleotide sequence in the library of expression vectors comprises a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)). The second nucleotide sequence comprises a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3).

[0084] The source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions is preferably from human, non-human primate, or rodent. Optionally, the source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions may be from one or more non-immunized animals. Preferably, the source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions may be from human fetal spleen, lymph nodes or peripheral blood cells.

[0085] Also according to any of the above-described compositions, methods and kits, the first and second polypeptide subunits may each further comprise a plurality of cysteine residues, preferably 2-8 Cys residues, at or adjacent the N- or C-terminus of the polypeptide. It is believed that by adding more cysteine residues near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through formation of Cys-Cys disulfide bonds, thus further stabilizing the assembly of the protein complex formed by the two subunits.

[0086] Alternatively, the first and second polypeptide subunits may each further comprise a “zipper” domain at or adjacent the N- or C-terminus of the polypeptide. As used herein, a “zipper domain” refers to a protein or peptide structural motif that can interact with another “zipper domain” with a different sequence to form a hetero-polymer such as a heterodimer. It is believed that by adding a zipper domain near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through non-covalent interactions (e.g. hydrophobic interactions), thus further stabilizing the assembly of the protein complex formed by the two subunits.

[0087] In addition, the first or the second polypeptide subunit may further comprise a “bundle” domain at or adjacent the C-terminus of the polypeptide. As used herein, a “bundle domain” refers to a protein or peptide structural motif that can interact with itself to form a homo-polymer such as a homopentamer. The bundle domains bring the protein complexes together by polymerization through non-covalent interactions such as coiled-coil interactions. It is believed that polymerization of the protein complex should enhance the avidity of the protein complexes to their binding target through multivalent binding. For example, the avidity of an antibody of the present invention may be dramatically increased by fusing a bundle domain (e.g. the coiled-coil domain of the cartilage oligomeric matrix protein) to the C-terminus of the heavy chain via a semi-rigid linker.

[0088] Also, the first or second polypeptide subunit may further comprise a signaling domain for screening the library of the protein complexes based on non-conventional two-hybrid methods such as the SRS (Sos Recruitment System) and RRS (Ras Recruitment System). Examples of such signaling domains include, but are not limited to, a Ras guanyl nucleotide exchange factor (e.g. human SOS factor), a membrane targeting signal such as a myristoylation sequence and farnesylation sequence, mammalian Ras lacking the carboxy-terminal domain (the CAAX box), and a ubiquitin sequence.

[0089] Also according to any of the above-described compositions, methods and kits, each of the expression vectors may further comprise a sequence encoding an affinity tag. Examples of affinity tags include, but are not limited to, polyhistidine tags, polyarginine tags, glutathione-S-transferase, maltose binding protein, staphylococcal protein A tag, and EE-epitope tags.

[0090] Also according to any of the above-described compositions, methods and kits, the transcription activator may be any transcription activator having separable DNA-binding and transcriptional activation domains. Examples of transcription activators include, but are not limited to, GAL4, GCN4, and ADR1 transcription activators.

[0091] Also according to any of the above-described compositions, methods and kits, the reporter protein encoded by the reporter gene may be any reporter protein whose expression produces a distinct genotype or phenotype in a cell. Examples of such a reporter protein include, but are not limited to, β-galactosidase, α-galactosidase, luciferase, β-glucuronidase, chloramphenicol acetyl transferase, secreted embryonic alkaline phosphatase, green fluorescent protein, enhanced blue fluorescent protein, enhanced yellow fluorescent protein, and enhanced cyan fluorescent protein.

BRIEF DESCRIPTION OF FIGURES

[0092] FIG. 1A illustrates a flow chart of a process that may be used in the present invention to screen for high affinity antibodies in a yeast two-hybrid system.

[0093] FIG. 1B illustrates a flow chart of a process that may be used in the present invention to screen for high affinity antibodies displayed on the surface of yeast cells.

[0094] FIG. 2 illustrates an embodiment of a method for generating a library of expression vectors by sequentially inserting V1 and V2 fragments into a linearized expression vector via homologous recombination.

[0095] FIG. 3 illustrates an embodiment of a method for generating a library of expression vectors by inserting a V1 fragment into an expression vector through directional cloning in bacteria and by inserting a V2 fragment into the linearized expression vector via homologous recombination in yeast.

[0096] FIG. 4 illustrates an embodiment of a method for selecting protein-protein binding pairs in a two-hybrid system where the expression vectors carrying the AD and BD domains are co-transformed or sequentially transformed into yeast.

[0097] FIG. 5 illustrates an embodiment of the method for selecting protein-protein binding pairs in a two-hybrid system where the expression vectors carrying the AD and BD domains are introduced into diploid yeast cells via mating between two haploid yeast strains of opposite mating types.

[0098] FIG. 6 illustrates an embodiment of a method for selecting protein-DNA binding pairs in a one-hybrid system where the expression vector carrying the AD domain is transformed into yeast.

[0099] FIG. 7 illustrates an embodiment of the method for selecting protein-protein binding pairs in a one-hybrid system where the expression vector carrying the AD domain is transformed into yeast.

[0100] FIG. 8 illustrates an embodiment of a high throughput method for selecting protein-protein binding pairs in a two-hybrid system where the library of tester expression vectors and the library of target expression vectors are each arrayed in multi-well plates.

[0101] FIG. 9 illustrates an embodiment of a method used for mutagenesis and further screening of the clones selected from a primary screening of the tester protein complexes carried by the expression vector of the present invention.

[0102] FIG. 10A illustrates secondary structures of double-chain variable fragments (dcFv), antibody fragments (Fab), and a fully-assembled antibody (Ab).

[0103] FIG. 10B illustrates secondary structures of dcFv, Fab, and Ab with zipper domains attached to the heavy chain and light chain regions.

[0104] FIG. 10C illustrates secondary structures of clusters of dcFv, Fab, and Ab with bundle domains attached to the heavy chain region.

[0105] FIG. 10D illustrates secondary structures of clusters of dcFv, Fab, or Ab with bundle domains attached to the heavy chain region via a linker.

[0106] FIG. 11 illustrates examples of functional expression systems for antibodies selected by using the method of the present invention.

[0107] FIG. 12A illustrates the plasmid map of pACT2.

[0108] FIG. 12B illustrates the plasmid map of pBridge.

[0109] FIG. 12C depicts a method of modifying pACT2 in order to introduce another expression vector derived from pBridge into the plasmid to produce a yeast expression vector having a double expression cassette (designated pACT2-DC).

[0110] FIG. 12D illustrates the plasmid map of pYD1.

DETAILED DESCRIPTION OF THE INVENTION

[0111] The present invention provides novel compositions, kits and efficient methods for preparing extremely diverse libraries of tester protein complexes, and selecting from these libraries proteins with high affinity and specificity toward a target protein, peptide or DNA in vivo. One feature of the present invention is the production of two or more polypeptides in vivo which self-assemble to form a protein complex in vivo. The in vivo formed protein complex is then tested in the same in vivo system for the complex's ability to bind to either a protein or a nucleotide sequence (DNA or RNA). The ability to express polypeptides, form protein complexes of those polypeptides, and screen the protein complexes all in the same intracellular system enables the present invention to screen large populations of protein complexes for binding with high throughput.

[0112] In one particular embodiment, highly diverse libraries of human antibodies can be produced and screened against virtually any target antigen by using the compositions, kits and methods of the present invention.

[0113] The present invention provides a general method for screening these diverse libraries of tester protein complexes against a single or a plurality of target proteins or peptides.

[0114] The method comprises: expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed between a first polypeptide subunit whose sequence varies within the library, and a second polypeptide subunit whose sequence varies within the library independently of the first polypeptide; expressing one or more target fusion proteins in the yeast cells expressing the tester proteins, each of the target fusion proteins comprising a target peptide or protein; and selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target fusion protein.

[0115] The library of tester protein complexes may be any multimeric proteins wherein the first and second polypeptide subunits are subunits of a multimeric protein whose sequence varies within the library of tester protein complexes.

[0116] The first and second polypeptide subunits are expressed as separate proteins by various mechanisms, such as expression from separate promoters or bicistronic expression from the same promoter via an internal ribosomal entry site (IRES, Paz et al. (1999) J. Biol. Chem. 274:21741-21745) or via a splicing donor-acceptor mechanism. The first and second subunits form a tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds). Since the sequences of the first and second polypeptide subunits (with a complexity of 10^(x) and 10^(y), respectively) vary independently within the library of the tester protein complexes, the complexity of the library of the protein complexes formed as a result of binding between the first and second polypeptide subunits should be 10^(x+y) theoretically.
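The theoretical complexity of 10^(x+y) follows from simple counting: each of the 10^(x) first-subunit variants can pair with each of the 10^(y) second-subunit variants. A minimal sketch of this arithmetic, with hypothetical function names and toy variant labels chosen only for illustration:

```python
from itertools import product


def theoretical_complexity(n_first: int, n_second: int) -> int:
    """Number of distinct complexes when two subunit pools vary independently."""
    return n_first * n_second


def enumerate_complexes(first_variants, second_variants):
    """List every (first, second) pairing; practical only for toy-sized pools."""
    return list(product(first_variants, second_variants))


# Toy pools with hypothetical labels: 3 heavy-chain and 2 light-chain variants.
toy_pairs = enumerate_complexes(["VH1", "VH2", "VH3"], ["VL1", "VL2"])
```

For example, pools of 10⁴ and 10⁵ variants give a theoretical 10⁹ pairings. The diversity actually sampled would be bounded by practical factors such as the number of transformed cells, which this count does not model.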

[0117] In a preferred embodiment, the library of tester protein complexes is a library of antibodies where the first and second polypeptide subunits comprise antibody heavy chain and light chain sequences, respectively. Alternatively, the library of tester protein complexes is a library of antibodies where the first and second polypeptide subunits comprise antibody light chain and heavy chain sequences, respectively. The first polypeptide subunit may comprise an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3). The second polypeptide subunit may comprise an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)). These light chain and heavy chain fragments are assembled in yeast cells to form a double-chain protein complex (dcFv) between V_(L) and V_(H), a Fab (fragment antigen-binding) fragment between (V_(L)+C_(L)) and (V_(H)+C_(H)1), and a fully assembled antibody formed between (V_(L)+C_(L)) and (V_(H)+C_(H)1+C_(H)2+C_(H)3).

[0118] The source of the coding sequences of the antibody light chain and heavy chain may be from humans, non-human primates, or rodents. For example, the source of the antibody coding sequences may be cDNA libraries derived from human spleen, peripheral white blood cells, fetal liver, and bone marrow.

[0119] From these libraries of antibodies, antibodies with high affinity and specificity are selected by screening the libraries against a single target antigen or a plurality of target antigens, in particular, in yeast. Compared to conventional approaches of generating monoclonal antibodies by hybridoma technology and the recently developed XENOMOUSE® technology, the present invention provides a more efficient and economical way to screen for fully human antibodies in a much shorter period of time. More importantly, the production and screening of the antibody libraries can be readily adapted for high throughput screening in vivo.

[0120] The library of the tester protein complexes may be produced in vivo or in vitro by using any methods known in the art. The present invention provides a novel method for generating and screening libraries of expression vectors encoding these tester proteins against a single or a plurality of target molecules in vivo. These methods are developed by exploiting an intrinsic property of yeast: homologous recombination at an extremely high level of efficiency.

[0121] FIG. 1A shows a flow chart delineating a preferred embodiment of the above method of the present invention for generating and screening highly diverse libraries of human antibodies or antibody fragments in yeast. As illustrated in FIG. 1A, a highly complex library of human antibodies is constructed in yeast cells. In particular, cDNA libraries of the heavy chain and light chain are transferred into a yeast expression vector by direct homologous recombination between the sequences encoding the heavy chain or the light chain and the yeast expression vector containing homologous recombination sites. The resulting expression vector is called an Ab expression vector. This primary antibody library may reach a diversity preferably between 10⁸-10¹⁴, more preferably between 10¹⁰-10¹², and most preferably between 10¹²-10¹⁴.

[0122] These highly complex primary antibody libraries can be used in a wide variety of applications. In particular, these libraries can be used for screening fully human antibodies against a wide variety of targets, such as a defined antigen or a library of antigens associated with diseases.

[0123] The screening for antibody-antigen interaction may be conveniently carried out in yeast by using a yeast two-hybrid method. For example, a library of Ab expression vectors is introduced into yeast cells. Expression of the antibody library in the yeast cells produces a library of assembled antibodies (the tester protein complexes) with either the heavy chain or the light chain fused with an activation domain (AD) of a transcription activator. The yeast cells are also modified to express a recombinant fusion protein comprising a DNA-binding domain (BD) of the transcription activator and a target antigen. The yeast cells are also modified to express a reporter gene whose expression is under the control of a specific DNA binding site. Upon binding of the antibody from the library to the target antigen, the AD is brought into close proximity of the BD, thereby causing transcriptional activation of a reporter gene downstream from a specific DNA binding site to which the BD binds. It is noted that the library of Ab expression vectors may contain the BD domain while the modified yeast cells express a fusion protein comprising the AD domain and the target antigen.
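The selection logic of the two-hybrid screen can be summarized in a short sketch. This is a deliberately simplified boolean model under stated assumptions: the class and field names are hypothetical, binding is treated as all-or-nothing, and the AD is placed on the antibody with the BD on the antigen (the text notes the domains may be swapped).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class YeastCell:
    """Toy model of one yeast cell in the screen (hypothetical fields)."""
    clone_id: str
    antibody_has_ad: bool       # assembled antibody carries the activation domain
    antigen_has_bd: bool        # target antigen is fused to the DNA-binding domain
    antibody_binds_antigen: bool


def reporter_expressed(cell: YeastCell) -> bool:
    """Reporter turns on only when binding brings the AD next to the BD."""
    return cell.antibody_has_ad and cell.antigen_has_bd and cell.antibody_binds_antigen


def select_positive_clones(cells: List[YeastCell]) -> List[str]:
    """Return the clone IDs of cells in which the reporter is expressed."""
    return [cell.clone_id for cell in cells if reporter_expressed(cell)]


toy_cells = [
    YeastCell("clone-1", True, True, True),    # binder: reporter on
    YeastCell("clone-2", True, True, False),   # non-binder: reporter off
    YeastCell("clone-3", True, False, True),   # no BD fusion: reporter off
]
positives = select_positive_clones(toy_cells)
```

A real screen reads out reporter activity as a phenotype (for example growth or color), so the boolean here stands in for a thresholded measurement.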

[0124] These Ab expression vectors may be introduced to yeast cells by co-transformation of diploid yeast cells or by direct mating between two strains of haploid yeast cells. For example, the Ab expression vectors containing libraries of V_(H) and V_(L) and an expression vector containing the target antigen can be used to co-transform diploid yeast cells in a form of yeast plasmid or bacteria-yeast shuttle plasmid. Alternatively, two strains of haploid yeast cells (e.g. α- and a-type strains of yeast), one containing the Ab expression vector and the other containing the target antigen expression vector, are mated to produce a diploid yeast cell containing both expression vectors. Preferably, the haploid yeast strain containing the target antigen expression vector also contains the reporter gene positioned downstream of the specific DNA binding site.

[0125] The yeast clones containing antibodies with binding affinity to the target antigen are selected based on phenotypes of the cells or other selectable markers. The plasmids encoding these primary antibody leads can be isolated and further characterized.

[0126] Alternatively, the first polypeptide subunit and/or the second polypeptide subunit can be expressed as a fusion protein with a cell wall/membrane protein, such as the yeast agglutinin Aga2p cell wall protein. Such a fusion allows transportation of the protein complex (e.g. antibody) formed between the first and second subunits to the cell wall/membrane, thus effectively mimicking the cell surface display of antibodies by B cells in the immune system for affinity maturation in vivo.

[0127] FIG. 1B depicts a general scheme for this alternative method of selection of antibodies displayed on the surface of yeast cells. As illustrated in FIG. 1B, the primary antibody library contains antibody variants having the heavy chain region fused to the C-terminus of a yeast agglutinin protein such as the yeast Aga2 subunit of a-agglutinin. Shusta et al. (1999) “Yeast polypeptide fusion surface display levels predict thermal stability and soluble secretion efficiency” J. Mol. Biol. 292:949-956.

[0128] Transportation of the antibody by the yeast cell wall protein allows the antibody library to be displayed on the surface of transformed yeast cells. One or more target molecules such as fluorescence-labeled antigen(s) are added to the cells. The cells displaying antibodies that bind to the antigen(s) can be conveniently selected by using fluorescence-activated cell sorting (FACS) or by using magnetic beads to isolate these cells.

[0129] After the selection of the primary library of human antibodies by using a yeast two-hybrid method or a yeast cell surface display method, the sequences encoding V_(H) and V_(L) of the primary antibody leads are mutagenized in vitro to produce a secondary antibody library. The V_(H) and V_(L) sequences can be randomly mutagenized by “poison” PCR (or error-prone PCR), by DNA shuffling, or by any other way of random or site-directed mutagenesis (or cassette mutagenesis). After mutagenesis in the regions of V_(H) and V_(L), the complexity of the secondary antibody library may reach 10⁴ or more. Overall, the combined diversity or complexity of the total antibody libraries generated by using the methods of the present invention, including the primary and the secondary antibody libraries, may reach 10¹⁸ or more. The secondary antibody library is further screened for antibodies that bind the target antigen at high affinity by using the yeast-2-hybrid method as described above or other methods of screening in vivo or in vitro.
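The effect of random mutagenesis on a lead sequence can be pictured with a small simulation. This is only an in silico illustration of random point substitution, not a model of error-prone PCR chemistry; the function names, the uniform per-base error rate, and the toy lead sequence are all assumptions made for the example.

```python
import random
from typing import List

BASES = "ACGT"


def error_prone_copy(template: str, error_rate: float, rng: random.Random) -> str:
    """Copy `template`, substituting each base with probability `error_rate`."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # Replace the base with one of the three other bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def secondary_library(lead: str, size: int, error_rate: float, seed: int = 0) -> List[str]:
    """Generate `size` mutagenized copies of a lead sequence (seeded for repeatability)."""
    rng = random.Random(seed)
    return [error_prone_copy(lead, error_rate, rng) for _ in range(size)]


# Hypothetical 12-nt lead fragment and an arbitrary 5% per-base error rate.
variants = secondary_library("ATGGCCAAGTTC", size=100, error_rate=0.05)
```

Each variant keeps the length of the lead, so the mutagenized fragments would remain compatible with the same flanking sequences used to build the primary library.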

[0130] An advantage of the present invention is that the overall process of generating, selecting and optimizing large, diverse libraries of antibodies mimics the process of natural antibody diversification and maturation in a mammal. In the natural process of antibody affinity maturation, the affinity of the antibodies against their antigen(s) is progressively increased with the passage of time after immunization, largely due to the accumulation of point mutations specifically in the coding sequences of both the heavy- and light-chain variable regions.

[0131] According to the present invention, extensive diversification is achieved by recombination and mutagenesis of the V_(H) and V_(L) chain libraries derived from a wide variety of sources including natural and artificial or synthetic sources. The homologous recombination of V_(H) and V_(L) in vivo to form the primary library of single-chain antibodies mimics the natural process of antibody gene assembly from different pools of gene segments encoding V_(H) and V_(L) of the antibodies. Since the method is preferably practiced with yeast cells, the highly efficient homologous recombination in yeast is particularly useful to facilitate such assembly of V_(H) and V_(L) in vivo.

[0132] The fast proliferation rate of yeast cells and ease of handling make a process of “molecular evolution” dramatically shorter than the natural process of antibody affinity maturation in a mammal. Therefore, antibody repertoires with extremely high diversity can be produced and screened directly in yeast cells at a much lower cost and higher efficiency than prior processes such as the painstaking, stepwise “humanization” of monoclonal murine antibodies isolated by using the conventional hybridoma technology (a “protein redesign”) or the recently-developed XENOMOUSE™ technology.

[0133] According to the “protein redesign” approach, murine monoclonal antibodies of desired antigen specificity are modified or “humanized” in vitro in an attempt to reshape the murine antibody to resemble more closely its human counterpart while retaining the original antigen-binding specificity. Riechmann et al. (1988) Nature 332:323-327. This humanization demands extensive, systematic genetic engineering of the murine antibody, which could take months, if not years. Additionally, extensive modification of the backbone of the murine monoclonal antibody may result in reduced specificity and affinity.

[0134] In comparison, by using the method of the present invention, fully human antibodies with high affinity to a specified antigen or antigens can be screened and isolated directly from yeast cells without going through site-by-site modification of the antibody, and without sacrifice of specificity and affinity of the selected antibodies.

[0135] The XENOMOUSE™ technology has been used to generate fully human antibodies with high affinity by creating strains of transgenic mice that produce human antibodies while suppressing the endogenous murine Ig heavy- and light-chain loci. However, the breeding of such strains of transgenic mice and selection of high affinity antibodies can take a long period of time. The antigen against which the pool of human antibodies is selected has to be recognized by the mouse as a foreign antigen for the mouse to mount an immune response; antibodies against a target antigen that is not immunogenic in a mouse may not be selectable by using this technology.

[0136] In contrast, by using the method of the present invention, libraries of antibodies can not only be generated at a great diversity and complexity in yeast cells more efficiently and economically, but also be screened against virtually any protein or peptide target regardless of its immunogenicity. According to the present invention, any protein/peptide target can be expressed as a fusion protein with a DNA-binding domain (or an activation domain) of a transcription activator and selected against the library of antibodies in a yeast-2-hybrid system. Moreover, multiple protein targets or a library of antigens may be arrayed in multiple-well plates and screened against the library of antibodies in a high throughput and automated manner.

[0137] Also compared to other approaches using transgenic goats and chickens to produce antibodies, the method of the present invention can be used to screen and produce fully human antibodies in large amounts without involving serious regulatory issues regarding the use of transgenic animals, as well as safety issues concerning containment of transgenic animals infected with recombinant viral vectors.

[0138] By using the method of the present invention, many requisite steps in the traditional construction of cDNA libraries can be eliminated. For example, the time-consuming and labor-intensive steps of ligation and recloning of cDNA libraries into expression vectors can be eliminated by direct recombination or “gap-filling” in yeast through general homologous recombination and/or site-specific recombination. Throughout the whole process of antibody library construction, the DNA fragments encoding antibody heavy chain and light chain are directly incorporated into a linearized yeast expression vector via homologous recombination without recourse to extensive recloning.

[0139] Compared with the approach of using phage display to screen for high affinity antibodies in vitro, the method of the present invention provides efficient ways of screening for high affinity antibodies in eukaryotic cells in vivo. By using phage display technology, human Ig heavy chain and light chain variable regions are cloned, combinatorially reassorted, expressed and displayed as antigen-binding human Fab or scFv fragments on the surface of filamentous phage. Winter et al. (1994) Ann. Rev. Immunol. 433-455; and Rader et al. (1997) Current Opinion in Biotechnol. 8:503-508. The phage-displayed human antigen-binding fragments are then screened for their ability to bind an immobilized target antigen in vitro, a process called biopanning. When high affinity human antibodies are desired, the phage display approach can be problematic, presumably due to non-native conformation of the antibody displayed on the surface and/or the extensive panning required for selection under in vitro conditions which bear little resemblance to the physiological condition of a human body. In contrast, by using the method of the present invention antibodies are selected based on their binding affinity to the target antigen in vivo. The antibodies are expressed in the cell, go through protein folding, and bind to their target antigen under a natural environment. Thus, the antibodies selected by using the method of the present invention should be more functionally relevant than those selected by panning in vitro.

[0140] 1. Libraries of the Expression Vectors of the Present Invention

[0141] The present invention provides a library of expression vectors. In one embodiment, a library of yeast expression vectors is provided. The yeast expression vectors forming the library comprise a first nucleotide sequence V1 encoding a first polypeptide subunit; and a second nucleotide sequence V2 encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors.

[0142] According to the embodiment, the first polypeptide subunit and the second polypeptide subunit can be expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters, or by expressing bicistronically from the same promoter via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.

[0143] According to the embodiment, the yeast expression vector may be a 2μ plasmid vector or a yc-type (centromeric) yeast vector, preferably a yeast-bacterial shuttle vector which contains a bacterial origin of replication.

[0144] Also according to the embodiment, V1 in the library of expression vectors comprises a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3). V2 comprises a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)).

[0145] Alternatively, V1 in the library of expression vectors comprises a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)). V2 comprises a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3).

[0146] When V1 and V2 are expressed by the yeast expression vector in yeast cells, such as cells from the Saccharomyces cerevisiae strains, the protein subunits comprising the V1 and V2 polypeptide segments respectively interact with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds) to form a double-chain protein complex.

[0147] Optionally, the first and second polypeptide subunits may each further comprise a plurality of cysteine residues, preferably 2-8 Cys residues. The additional cysteine residues may be located at or adjacent the N- or C-terminus of the first and second polypeptide subunits. As illustrated in FIG. 10A, the additional cysteine residues are preferably located near the C-terminus of the heavy chain and light chain regions of a dcFv, Fab and a fully assembled antibody.

[0148] It is believed that by adding more cysteine residues near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through formation of Cys-Cys disulfide bonds, thus further stabilizing the assembly of the protein complex formed by the two subunits.

[0149] Alternatively, the first and second polypeptide subunits may each further comprise a “zipper” domain at or adjacent the N- or C-terminus of the polypeptide. As illustrated in FIG. 10B, the zipper domain is preferably located at the C-terminus of the heavy chain and light chain regions of a dcFv, Fab and a fully assembled antibody.

[0150] A zipper domain is a protein or peptide structural motif that interacts with another such motif through non-covalent interactions such as coiled-coil interactions and brings other proteins fused with the zipper domains into close proximity. Examples of zipper domains include, but are not limited to, leucine zippers (or helix-loop-helix, also called bHLHzip motif) formed between the nuclear oncoproteins Fos and Jun (Kouzarides and Ziff (1989) “Behind the Fos and Jun leucine zipper” Cancer Cells 1:71-76); leucine zippers formed between proto-oncoproteins Myc and Max (Luscher and Larsson (1999) “The basic region/helix-loop-helix/leucine zipper domain of Myc proto-oncoproteins: function and regulation” Oncogene 18:2955-2966); zipper motifs from adhesion proteins such as the N-terminal domain of neural cadherin (Weis (1995) “Cadherin structure: a revealing zipper” 3:425-427); zipper-like structural motifs from collagen triple helices or cartilage oligomeric matrix proteins (Engel and Prockop “The zipper-like folding of collagen triple helices and the effects of mutations that disrupt the zipper” Annu. Rev. Biophys. Biophys. Chem. 20:137-152; and Terskikh et al. (1997) “‘Peptabody’: a new type of high avidity binding protein” Proc. Natl. Acad. Sci. USA 94:1663-1668).

[0151] The zipper domain may be fused to the N- or C-terminus of the polypeptide subunits, preferably at the C-terminus of the subunits. For example, the leucine zipper domain derived from the oncoprotein Jun can be expressed as a fusion protein with an antibody heavy chain whereas the leucine zipper domain derived from the oncoprotein Fos can be expressed as another fusion protein with an antibody light chain. Since the Jun and Fos leucine zipper domains can bind to each other with high affinity, the antibody heavy chain and light chain fused with Jun and Fos zipper, respectively, can be brought into close proximity and form a heterodimer upon binding between these two zipper domains.

[0152] It is believed that by adding a zipper domain near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through non-covalent interactions (e.g. hydrophobic interactions), thus further stabilizing the assembly of the protein complex formed by the two subunits. Moreover, fusing a zipper domain derived from a nuclear protein such as Jun or Fos to the subunits may facilitate efficient transport of the subunits to the nucleus, where the protein complex formed between the two subunits performs desired functions such as transcriptional activation of a reporter gene.

[0153] In addition, the first or the second polypeptide subunit may further comprise a “bundle” domain at or adjacent the C-terminus of the polypeptide. As used herein, a “bundle domain” refers to a protein or peptide structural motif that can interact with itself to form a homo-polymer such as a homopentamer. As illustrated in FIG. 10C, the bundle domains bring the protein complexes together by polymerization through non-covalent interactions such as coiled-coil interactions. It is believed that polymerization of the protein complex should enhance the avidity of the protein complexes to their binding target through multivalent binding.

[0154] For example, the coiled-coil assembly domain of the cartilage oligomeric matrix protein (COMP) may serve as a bundle domain. The N-terminal fragment of rat COMP comprises residues 20-83. This fragment can form pentamers similar to the assembly domain of the native protein. The fragment adopts a predominantly alpha-helical structure. Efimov et al. (1994) “The thrombospondin-like chains of cartilage oligomeric matrix protein are assembled by a five-stranded alpha-helical bundle between residues 20 and 83” FEBS Lett. 341:54-58.

[0155] The coiled-coil domain of the nudE gene of the filamentous fungus Aspergillus nidulans or the gene encoding the nuclear distribution protein RO11 of Neurospora crassa may also serve as a bundle domain. The product of the nudE gene, NUDE, is a homologue of the RO11 protein. The N-terminal coiled-coil domain of the NUDE protein is highly conserved; and a similar coiled-coil domain is present in several putative human proteins and in the mitotic phosphoprotein 43 (MP43) of X. laevis. Efimov and Morris (2000) “The LIS1-related NUDF protein of Aspergillus nidulans interacts with the coiled-coil domain of the NUDE/RO11 protein” J. Cell Biol. 150:681-688.

[0156] In addition, the coiled-coil segments of fibritin encoded by bacteriophage T4 may also serve as a bundle domain. The bacteriophage T4 late gene wac (Whisker's antigen control) encodes a fibrous protein which forms a collar/whiskers complex. Analysis of the 486 amino acid sequence of fibritin reveals three structural components: a 408 amino acid region that contains 12 putative coiled-coil segments with a canonical heptad (a-b-c-d-e-f-g)n substructure where the “a” and “d” positions are preferentially occupied by apolar residues, and the N- and C-terminal domains (47 and 29 amino acid residues, respectively). The alpha-helical segments are separated by short “linker” regions, variable in length, that have a high proportion of glycine and proline residues. Co-assembly of full-length fibritin and the N-terminal deletion mutant, as well as analytical centrifugation, indicates that the protein is a parallel triple-stranded alpha-helical coiled-coil. The last 18 C-terminal residues of fibritin are required for correct trimerisation of gpwac monomers in vivo. Efimov et al. (1994) “Fibritin encoded by bacteriophage T4 gene wac has a parallel triple-stranded alpha-helical coiled-coil structure” J. Mol. Biol. 242:470-486.
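The heptad pattern described above, in which the “a” and “d” positions are preferentially apolar, can be illustrated with a minimal sketch. The set of residues treated as apolar, the function names, and the toy test sequence are assumptions made for illustration only; the sequence is not taken from fibritin.

```python
# Residues treated as apolar here are an illustrative assumption.
APOLAR = set("AVLIMFWC")
HEPTAD = "abcdefg"

def heptad_score(seq: str, offset: int = 0) -> float:
    """Fraction of 'a' and 'd' positions holding an apolar residue,
    when seq[0] is assigned heptad position HEPTAD[offset]."""
    hits = total = 0
    for i, residue in enumerate(seq):
        position = HEPTAD[(i + offset) % 7]
        if position in "ad":
            total += 1
            hits += residue in APOLAR
    return hits / total if total else 0.0

def best_register(seq: str) -> int:
    """Offset (0-6) that places the most apolar residues at 'a'/'d'."""
    return max(range(7), key=lambda off: heptad_score(seq, off))

# Hypothetical toy repeat with Leu at 'a' and 'd'; not a real protein.
toy = "LKELEDK" * 3
print(best_register(toy), heptad_score(toy, best_register(toy)))
```

A high score in one register and low scores in the others is the kind of signal that suggests a coiled-coil segment; real predictors use position-specific statistics rather than this simple count.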

[0157] The bundle domain may be fused to the C-terminus of the first or second polypeptide subunit. Optionally, a semi-rigid linker may be used to link the bundle domain to the subunit. As illustrated in FIG. 10D, this linker serves as a hinge that allows a controlled conformational flexibility of the cluster of protein complexes formed between the first and second polypeptide subunits, which provides the space necessary for multivalent binding. For example, the 24 amino acid hinge region derived from camel IgG, (PQ)₂PK(PQ)₄PKPQPK(PE)₂ [SEQ ID NO: 79] may be used as such a semi-rigid linker. Further, cysteine residues may be introduced to the bundle domain, preferably near the N-terminus, to allow the formation of additional disulfide bonds between the bundle domains.
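As a check on the shorthand used for the hinge region, the repeat notation can be expanded into a one-letter amino acid string. This is a minimal sketch: the function name is hypothetical, and the subscript repeat counts are written as ordinary digits.

```python
import re

def expand_repeat_notation(pattern: str) -> str:
    """Expand shorthand such as '(PQ)2PK' into 'PQPQPK'.

    A parenthesised group followed by a count is repeated that many
    times; residues outside parentheses are copied unchanged.
    """
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        pattern,
    )

# The hinge shorthand from the text, with subscripts as plain digits.
hinge = expand_repeat_notation("(PQ)2PK(PQ)4PKPQPK(PE)2")
print(hinge)       # PQPQPKPQPQPQPQPKPQPKPEPE
print(len(hinge))  # 24
```

The expanded string has 24 residues, which agrees with the stated length of the hinge region.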

[0158] According to this design of the present invention, avidity of the protein complex formed between a heavy chain and light chain region of antibody (i.e. an antibody) may be dramatically increased by fusing a bundle domain (e.g. COMP) to the C-terminus of the heavy chain. Polymerization of the bundle domains should bring multiple antibodies together and thus enhance the avidity interactions between the antibodies and their targets due to multivalent binding. This process mimics the natural assembly of multiple IgM produced during the primary immune response. The low affinity of IgM is compensated by its pentameric structure, resulting in a high avidity toward repetitive antigenic determinants present on the surface of bacteria or viruses. Roitt (1991) Essential Immunology (Oxford/Blackwell, London), 7^(th) Ed., pp. 65-84.

[0159] In another embodiment, a library of expression vectors is provided. The expression vector in the library comprises: a transcription sequence encoding an activation domain AD or a DNA binding domain BD of a transcription activator; a first nucleotide sequence V1 encoding a first polypeptide subunit; and a second nucleotide sequence V2 encoding a second polypeptide subunit. The activation domain or the DNA binding domain of the transcription activator and the first polypeptide subunit are expressed as a single fusion protein. The second polypeptide subunit is expressed as a separate protein or peptide from the first polypeptide. In addition, V1 and V2 each independently vary within the library of expression vectors.

[0160] According to the embodiment, the expression vector may be any gene-transferring vector as long as it is able to introduce the library of expression vectors to a desired location within a host cell, such as by transformation, transfection and transduction of the expression vector into a host cell. The expression vector may be a bacterial, phage, yeast, mammalian or a viral expression vector, and preferably a yeast expression vector.

[0161] Also according to the embodiment, the transcription activator sequence may be located 5′ relative to the first nucleotide sequence. Alternatively, the transcription activator sequence may be located 3′ relative to the first nucleotide sequence.

[0162] In a variation of the embodiment, V1 is a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3). V2 is a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)).

[0163] Alternatively, V1 is a coding sequence of an antibody light-chain variable region (V_(L)) or an antibody light-chain including both the variable and constant region (V_(L)+C_(L)). V2 is a coding sequence of an antibody heavy-chain variable region (V_(H)) or an antibody heavy-chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3).

[0164] Optionally, AD is an activation domain of the yeast GAL4 transcription activator; and BD is a DNA binding domain of the yeast GAL4 transcription activator.

[0165] When V1 and V2 are expressed by the expression vector in host cells, such as cells from the Saccharomyces cerevisiae strains, the fusion protein comprising the AD and the V1-encoded polypeptide subunit and the V2-encoded polypeptide subunit interact with each other and form a protein complex with one or more conformations. The conformation(s) adopted by the protein complex of the AD/V1 fusion and the V2-encoded polypeptide subunit may have suitable binding site(s) for a specific target protein. For example, the protein complex may be a dsFv, a Fab or a full antibody that binds to its specific target antigen. The AD domain of the fusion protein should be able to activate transcription of gene(s) once the AD and BD domains are reconstituted to form an active transcription activator in vitro or in vivo by a two-hybrid method.

[0166] According to any of the libraries described above, the diversity of the first and/or the second polypeptide subunit encoded by V1 and V2 within the library of expression vectors may be preferably between 10³-10⁸, more preferably between 10⁴-10⁸, and most preferably between 10⁵-10⁸.

[0167] According to any of the libraries described above, the diversity of the first and/or the second polypeptide subunit encoded by V1 and V2 within the library of expression vectors may be preferably at least 10³, more preferably at least 10⁴, and most preferably at least 10⁵.

[0168] Also according to any of the libraries described above, the diversity of the fusion proteins encoded by the library of expression vectors is preferably between 10⁶-10¹⁸, more preferably between 10⁹-10¹⁸, and most preferably between 10¹⁰-10¹⁸.
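Because V1 and V2 vary independently within the library, the number of distinct V1/V2 pairings is bounded above by the product of the two individual diversities. The sketch below shows only this arithmetic; it assumes every V1 sequence can in principle be combined with every V2 sequence, and the function name is hypothetical.

```python
import math

def pair_diversity(n_v1: int, n_v2: int) -> int:
    """Upper bound on distinct V1/V2 combinations when the two
    sequences vary independently (assumed: all pairings possible)."""
    return n_v1 * n_v2

# Equal diversity assumed for V1 and V2 purely for illustration.
for n in (10**3, 10**4, 10**5):
    total = pair_diversity(n, n)
    print(f"V1 = V2 = 10^{round(math.log10(n))} "
          f"-> up to 10^{round(math.log10(total))} pairs")
```

The number of pairings actually present in a library is further limited by the number of independent transformants obtained, which this sketch does not model.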

[0169] Also according to any of the libraries described above, the diversities of the first and second polypeptide subunits need not be derived from mutagenizing one or more proteins that are known to bind to a target peptide or protein. For example, the first and second polypeptide subunits need not be derived from mutagenizing a single antibody (e.g. the antibody Herceptin®) which is known to bind to a target peptide or protein (Her-2 receptor). This reflects a novel ability of the present invention to identify new protein-protein binding pairs from a random pool of sequences instead of having to know in advance a protein that binds to a target and then form a library of mutants from that known binding protein.

[0170] The elements of the expression vector in the library are described in detail below.

[0171] 1) The Backbone of the Expression Vector

[0172] The expression vector of the present invention may be based on any type of vector as long as the vector can transform, transfect or transduce a host cell. The expression vector contains a library of the V1 sequences and a library of V2 sequences, and preferably contains a sequence encoding an activation domain (AD) of a transcriptional activator. The acceptor vector may be a plasmid, phage or viral vector as long as it is able to replicate in vitro, or in a host cell, or to convey the library of the V1 and V2 sequences to a desired location within a host cell. Examples of host cells include, but are not limited to, bacterial (e.g. E. coli, Bacillus subtilis, etc.), yeast, animal, plant, and insect cells.

[0173] In a preferred embodiment, the expression vector is based on a yeast plasmid, especially one from Saccharomyces cerevisiae. After transformation of yeast cells, the exogenous DNA encoding the V1 and V2 sequences is taken up by the cells and subsequently expressed by the transformed cells.

[0174] More preferably, the expression vector may be a yeast-bacteria shuttle vector which can be propagated in either Escherichia coli or yeast. Struhl et al. (1979) Proc. Natl. Acad. Sci. 76:1035-1039. The inclusion of E. coli plasmid DNA sequences, such as pBR322, facilitates the quantitative preparation of vector DNA in E. coli, and thus the efficient transformation of yeast.

[0175] The yeast plasmid vector that serves as the shuttle may be a replicating vector or an integrating vector. A replicating vector is a yeast vector that is capable of mediating its own maintenance, independent of the chromosomal DNA of yeast, by virtue of the presence of a functional origin of DNA replication. An integrating vector relies upon recombination with the chromosomal DNA to facilitate replication and thus the continued maintenance of the recombinant DNA in the host cell. A replicating vector may be a 2μ-based plasmid vector in which the origin of DNA replication is derived from the endogenous 2μ plasmid of yeast. Alternatively, the replicating vector may be an autonomously replicating (ARS) vector, in which the “apparent” origin of replication is derived from the chromosomal DNA of yeast. Optionally, the replicating vector may be a centromeric (CEN) plasmid which carries, in addition to one of the above origins of DNA replication, a sequence of yeast chromosomal DNA known to harbor a centromere.

[0176] The vectors may be transformed into yeast cells in a closed circular form or in a linear form. Transformation of yeast by integrating vectors, although with inheritable stability, may not be efficient when the vector is in a closed circular form (e.g. 1-10 transformants per μg of DNA). Linearized vectors, with free ends located in DNA sequences homologous with yeast chromosomal DNA, transform yeast with higher efficiency (100-1000 fold) and the transforming DNA is generally found integrated in sequences homologous to the site of cleavage. Thus, by cleaving the vector DNA with a suitable restriction endonuclease, it is possible to increase the efficiency of transformation and target the site of chromosomal integration. Integrative transformation may be applicable to the genetic modification of brewing yeast, provided that the efficiency of transformation is sufficiently high and the target DNA sequence for integration is within a region that does not disrupt genes essential to the metabolism of the host cell.

[0177] ARS plasmids, which have a high copy number (approximately 20-50 copies per cell) (Hyman et al., 1982), tend to be the most unstable, and are lost at a frequency greater than 10% per generation. However, the stability of ARS plasmids can be enhanced by the attachment of a centromere; centromeric plasmids are present at 1 or 2 copies per cell and are lost at only approximately 1% per generation.

[0178] The expression vector of the present invention is preferably based on the 2μ plasmid. The 2μ plasmid is known to be nuclear in cellular location, but is inherited in a non-Mendelian fashion. Cells that lost the 2μ plasmid have been shown to arise from haploid yeast populations having an average copy number of 50 copies of the 2μ plasmid per cell at a rate of between 0.001% and 0.01% of the cells per generation. Futcher & Cox (1983) J. Bacteriol. 154:612. Analysis of different strains of S. cerevisiae has shown that the plasmid is present in most strains of yeast including brewing yeast. The 2μ plasmid is ubiquitous and possesses a high degree of inheritable stability in nature.
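To put the per-generation loss rates quoted above for ARS, centromeric and 2μ plasmids on a common footing, the following minimal sketch estimates the fraction of cells still carrying a plasmid after a number of generations. It assumes a constant, independent loss probability per generation with no selection and no growth difference between plasmid-bearing and plasmid-free cells. The 20-generation horizon and the specific rates (10% for ARS, which the text gives as a lower bound; 1% for CEN; 0.01% for the native 2μ plasmid, the upper end of the quoted range) are illustrative choices, not values prescribed by the invention.

```python
def fraction_retaining(loss_per_generation: float, generations: int) -> float:
    """Fraction of cells still carrying the plasmid after the given
    number of generations, assuming a constant loss probability."""
    return (1.0 - loss_per_generation) ** generations

# Illustrative per-generation loss rates taken from the ranges in the text.
LOSS_RATES = {"ARS": 0.10, "CEN": 0.01, "2-micron": 0.0001}

retention = {name: fraction_retaining(rate, 20)
             for name, rate in LOSS_RATES.items()}
for name, value in retention.items():
    print(f"{name}: {value:.3f} of cells retain the plasmid")
```

Under these assumptions the ordering ARS < CEN < 2μ in retained fraction follows directly from the loss rates, consistent with the preference for a 2μ-based vector.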

[0179] The 2μ plasmid harbors a unique bidirectional origin of DNA replication which is an essential component of all 2μ-based vectors. The plasmid contains four genes, REP1, REP2, REP3 and FLP, which are required for the stable maintenance of high plasmid copy number per cell. Jayaram et al. (1983) Cell 34:95. The REP1 and REP2 genes encode trans-acting proteins which are believed to function in concert by interacting with the REP3 locus to ensure the stable partitioning of the plasmid at cell division. In this respect, the REP3 gene behaves as a cis-acting locus which effects the stable segregation of the plasmid, and is phenotypically analogous to a chromosomal centromere. An important feature of the 2μ plasmid is the presence of two inverted DNA sequence repeats (each 559 base-pairs in length) which separate the circular molecule into two unique regions. Intramolecular recombination between the inverted repeat sequences results in the inversion of one unique region relative to the other and the production in vivo of a mixed population of two structural isomers of the plasmid, designated A and B. Recombination between the two inverted repeats is mediated by the protein product of a gene called the FLP gene, and the FLP protein is capable of mediating high frequency recombination within the inverted repeat region. This site-specific recombination event is believed to provide a mechanism which ensures the amplification of plasmid copy number. Murray et al. (1987) EMBO J. 6:4205.
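The interconversion of the A and B isomers by recombination between the inverted repeats can be pictured with a toy model. In this sketch the plasmid is represented as an ordered list of named segments with a strand orientation; the segment names, the representation and the function are all hypothetical simplifications and carry no sequence information.

```python
from typing import List, Tuple

Segment = Tuple[str, int]  # (name, orientation: +1 or -1)

def flp_invert(plasmid: List[Segment]) -> List[Segment]:
    """Invert everything lying between the two inverted repeats,
    reversing segment order and flipping each orientation."""
    names = [name for name, _ in plasmid]
    i, j = names.index("IR1"), names.index("IR2")
    inner = [(name, -strand) for name, strand in reversed(plasmid[i + 1:j])]
    return plasmid[:i + 1] + inner + plasmid[j:]

# Hypothetical layout: two unique regions separated by inverted repeats.
form_a = [("unique_1", +1), ("IR1", +1), ("unique_2", +1), ("IR2", -1)]
form_b = flp_invert(form_a)

print(form_b)                        # unique_2 is now in the opposite orientation
print(flp_invert(form_b) == form_a)  # a second inversion restores the first isomer
```

Because a second inversion restores the starting arrangement, repeated recombination yields a mixed population of the two isomers, as described above.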

[0180] The expression vector may also contain an Escherichia coli origin of replication and E. coli antibiotic resistance genes for propagation and antibiotic selection in bacteria. Many E. coli origins are known, including ColE1, pMB1 and pBR322. The ColE1 origin of replication is preferably used in this invention. Many E. coli drug resistance genes are known, including the ampicillin resistance gene, the chloramphenicol resistance gene and the tetracycline resistance gene. In one particular embodiment, the ampicillin resistance gene is used in the vector.

[0181] The transformants that carry the V1 and V2 sequences may be selected by using various selection schemes. The selection is typically achieved by incorporating within the vector DNA a gene with a discernible phenotype. In the case of vectors used to transform laboratory yeast, prototrophic genes, such as LEU2, URA3 or TRP1, are usually used to complement auxotrophic lesions in the host. However, in order to transform brewing yeast and other industrial yeasts, which are frequently polyploid and do not display auxotrophic requirements, it is necessary to utilize a selection system based upon a dominant selectable gene. In this respect replicating transformants carrying 2μ-based plasmid vectors may be selected based on expression of marker genes which mediate resistance to: antibiotics such as G418, hygromycin B and chloramphenicol, or otherwise toxic materials such as the herbicide sulfometuron methyl, compactin and copper.

[0182] 2) The V1 and V2 Variable Sequences

[0183] The first and the second polypeptide subunits encoded by V1 and V2, respectively, may be subunits of any multimeric protein. The sequence of the multimeric protein varies within a library or a collection of multimeric proteins. Examples of the multimeric proteins include, but are not limited to, antibodies, growth factor receptors, T cell receptors, cytokine receptors, tyrosine kinase-associated receptors, and MHC proteins.

[0184] In a preferred embodiment, the multimeric proteins are a library of antibodies, and more preferably human antibodies. For example, the first polypeptide subunit encoded by the library of expression vectors may be a human antibody heavy chain variable region (V_(H)) or a full heavy chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3). The second polypeptide subunit encoded by the library of expression vectors may be a human antibody light-chain variable region (V_(L)) or a light chain including both the variable and constant region (V_(L)+C_(L)).

[0185] DNA sequences encoding human antibody heavy chain and light chain may be polynucleotide segments of at least 30 contiguous base pairs substantially encoding genes of the immunoglobulin superfamily. A. F. Williams and A. N. Barclay (1989) “The Immunoglobulin Gene Superfamily”, in Immunoglobulin Genes, T. Honjo, F. W. Alt, and T. H. Rabbitts, eds., Academic Press: San Diego, Calif., pp. 361-387. The antibody genes are most frequently encoded by human, non-human primate, avian, porcine, bovine, ovine, goat, or rodent heavy chain and light chain gene sequences.

[0186] The library of DNA sequences encoding human antibody heavy chain and light chain may be derived from a variety of sources. For example, mRNA encoding the human antibody libraries may be extracted from cells or organs from immunized or non-immunized animals or humans. Preferably, organs such as human fetal spleen and lymph nodes may be used. Peripheral blood cells from non-immunized humans may also be used. The blood samples may be from an individual donor, from multiple donors, or from combined blood sources.

[0187] The human antibody coding sequences may be derived and amplified by using sets of oligonucleotide primers to amplify the cDNA of human heavy and light chains by polymerase chain reaction (PCR). Orlandi et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837. For example, blood samples may be taken from healthy volunteers and the B-lymphocytes in the blood can be isolated. RNA can be prepared by following standard procedures. Cathala et al. (1983) DNA 3:329. The cDNA can be made from the isolated RNA by using reverse transcriptase.

[0188] Alternatively, the antibody coding sequences may be derived from an artificially rearranged immunoglobulin gene or genes. For example, immunoglobulin genes may be rearranged by joining of germ line V segments in vitro to J segments, and, in the case of V_(H) domains, D segments. The joining of the V, J and D segments may be facilitated by using PCR primers which have a region of random or specific sequence to introduce artificial sequence or diversity into the products.

[0189] Optionally, the variable sequences V1 and V2 of the library of expression vectors may also be derived from multimeric proteins other than antibodies. V1 and V2 may be different subunits of a non-antibody multimeric protein, such as membrane proteins and cell surface receptor proteins, e.g. insulin receptor, MHC proteins (e.g. class I MHC and class II MHC protein), CD3 receptor, T cell receptors, cytokine receptors such as interleukin-2 (IL-2) receptor which is made of α, β, and γ subunits, tyrosine-kinase-associated receptors such as Src, Yes, Fgr, Lck, Lyn, Hck, and Blk. The tyrosine-kinase-associated receptors contain SH2 and SH3 domains which are held there partly by their interactions with transmembrane receptor proteins and partly by covalently attached lipid chains. For example, V1 and V2 sequences may be mutagenized sequences of the SH2 and SH3 domains of a tyrosine-kinase-associated receptor such as Src, respectively, which are incorporated into the expression vector of the present invention and screened against various ligands for this receptor.

[0190] A reflection of the power and versatility of the methods of the present invention is that the V1 and V2 sequences need not be based in any way on a protein sequence known to bind to the target. Instead, V1 and V2 may be from any source and may have a diversity that is entirely independent from the target, or one or more lead proteins known to bind to the target.

[0191] 3) The Target Proteins and Peptides

[0192] The target fusion protein may comprise any target protein or peptide that may be expressed or otherwise present in a host cell. The target protein may be a member of a library of proteins or peptides, such as a collection of human ESTs, a total library of human ESTs, a collection of domain structures (e.g. Zn-finger protein domains), or a totally random peptide library.

[0193] For example, the target protein or peptide may be a disease-associated antigen, such as a tumor surface antigen, e.g. B-cell idiotypes, CD20 on malignant B cells, CD33 on leukemic blasts, and HER2/neu on breast cancer. Antibodies selected against these antigens can be used in a wide variety of therapeutic and diagnostic applications, such as treatment of cancer by direct administration of the antibody itself or the antibody conjugated with a radioisotope or cytotoxic drug, and in a combination therapy involving coadministration of the antibody with a chemotherapeutic agent, or in conjunction with radiation therapy.

[0194] Alternatively, the target protein may be a growth factor receptor. Examples of the growth factor include, but are not limited to, epidermal growth factors (EGFs), transferrin, insulin-like growth factor, transforming growth factors (TGFs), interleukin-1, and interleukin-2. For example, high expression of EGF receptors has been found in a wide variety of human epithelial primary tumors. TGF-α has been found to mediate an autocrine stimulation pathway in cancer cells. Several murine monoclonal antibodies have been demonstrated to be able to bind EGF receptors, block the binding of ligand to EGF receptors, and inhibit proliferation of a variety of human cancer cell lines in culture and in xenograft models. Mendelsohn and Baselga (1995) Antibodies to growth factors and receptors, in Biologic Therapy of Cancer, 2^(nd) Ed., J B Lippincott, Philadelphia, pp. 607-623. Thus, fully human antibodies selected against these growth factors by using the method of the present invention can be used to treat a variety of cancers.

[0195] The target protein may also be a cell surface protein or receptor associated with coronary artery disease, such as platelet glycoprotein IIb/IIIa receptor, or with autoimmune diseases, such as CD4, CAMPATH-1 and the lipid A region of the gram-negative bacterial lipopolysaccharide. Humanized antibodies against CD4 have been tested in clinical trials in the treatment of patients with mycosis fungoides, generalized pustular psoriasis, severe psoriasis, and rheumatoid arthritis. Antibodies against the lipid A region of the gram-negative bacterial lipopolysaccharide have been tested clinically in the treatment of septic shock. Antibodies against CAMPATH-1 have also been tested clinically in the treatment of refractory rheumatoid arthritis. Thus, fully human antibodies selected against these proteins by using the method of the present invention can be used to treat a variety of autoimmune diseases. Vaswani et al. (1998) “Humanized antibodies as potential therapeutic drugs” Annals of Allergy, Asthma and Immunology 81:105-115.

[0196] The target protein or peptide may also be a protein or peptide associated with human allergic diseases, such as inflammatory mediator proteins, e.g. Interleukin-1 (IL-1), tumor necrosis factor (TNF), leukotriene receptor and 5-lipoxygenase, and adhesion molecules such as V-CAM/VLA-4. In addition, IgE may also serve as the target antigen because IgE plays a pivotal role in type I immediate hypersensitive allergic reactions such as asthma. Studies have shown that the level of total serum IgE tends to correlate with severity of diseases, especially in asthma. Burrows et al. (1989) “Association of asthma with serum IgE levels and skin-test reactivity to allergens” New Engl. J. Med. 320:271-277. Thus, fully human antibodies selected against IgE by using the method of the present invention may be used to reduce the level of IgE or block the binding of IgE to mast cells and basophils in the treatment of allergic diseases without having substantial impact on normal immune functions.

[0197] The target protein may also be a viral surface or core protein which may serve as an antigen to trigger an immune response of the host. Examples of these viral proteins include, but are not limited to, glycoproteins (or surface antigens, e.g., GP120 and GP41) and capsid proteins (or structural proteins, e.g., P24 protein); surface antigens or core proteins of hepatitis A, B, C, D or E virus (e.g. small hepatitis B surface antigen (SHBsAg) of hepatitis B virus and the core proteins of hepatitis C virus, NS3, NS4 and NS5 antigens); glycoprotein (G-protein) or the fusion protein (F-protein) of respiratory syncytial virus (RSV); surface and core proteins of herpes simplex virus HSV-1 and HSV-2 (e.g., glycoprotein D from HSV-2).

[0198] The target protein may also be the product of a mutated tumor suppressor gene that has lost its tumor-suppressing function and may render the cells more susceptible to cancer. Tumor suppressor genes are genes that function to inhibit the cell growth and division cycles, thus preventing the development of neoplasia. Mutations in tumor suppressor genes cause the cell to ignore one or more of the components of the network of inhibitory signals, overcoming the cell cycle checkpoints and resulting in a higher rate of uncontrolled cell growth, i.e., cancer. Examples of the tumor suppressor genes include, but are not limited to, DPC-4, NF-1, NF-2, RB, p53, WT1, BRCA1 and BRCA2.

[0199] DPC-4 is involved in pancreatic cancer and participates in a cytoplasmic pathway that inhibits cell division. NF-1 codes for a protein that inhibits Ras, a cytoplasmic inhibitory protein. NF-1 is involved in neurofibroma and pheochromocytomas of the nervous system and myeloid leukemia. NF-2 encodes a nuclear protein that is involved in meningioma, schwannoma, and ependymoma of the nervous system. RB codes for the pRB protein, a nuclear protein that is a major inhibitor of the cell cycle. RB is involved in retinoblastoma as well as bone, bladder, small cell lung and breast cancer. p53 codes for the p53 protein, which regulates cell division and can induce apoptosis. Mutation and/or inactivation of p53 is found in a wide range of cancers. WT1 is involved in Wilms tumor of the kidneys. BRCA1 is involved in breast and ovarian cancer, and BRCA2 is involved in breast cancer. Thus, fully human antibodies selected against a mutated tumor suppressor gene product by using the method of the present invention can be used to block the interactions of the gene product with other proteins or biochemicals in the pathways of tumor onset and development.

[0200] 2. Construction of the Library of Expression Vectors of the Present Invention

[0201] The library of expression vectors described above can be constructed using a variety of recombinant DNA techniques. The present invention provides novel and efficient methods of constructing these libraries of expression vectors with extreme diversity of V1 and V2 in vivo and in vitro.

[0202] The methods of the present invention are provided by exploiting the inherent ability of yeast cells to facilitate homologous recombination at an extremely high efficiency. The mechanism of homologous recombination in yeast and its applications are briefly described below.

[0203] The yeast Saccharomyces cerevisiae has an inherent genetic machinery to carry out efficient homologous recombination in the cell. This mechanism is believed to benefit the yeast cells for chromosome repair purposes and is traditionally also called gap repair or gap filling. By this mechanism of efficient gap filling, mutations can be introduced into specific loci of the yeast genome. For example, a vector carrying the mutant gene contains two sequence segments that are homologous to the 5′ and 3′ open reading frame (ORF) sequences of the gene that is intended to be interrupted or mutated. The plasmid also contains a positive selection marker, such as a nutritional enzyme allele (e.g., ura3) or an antibiotic resistance marker (e.g., geneticin (G418) resistance), that is flanked by the two homologous segments. This plasmid is linearized and transformed into the yeast cells. Through homologous recombination between the plasmid and the yeast genome at the two homologous recombination sites, a reciprocal exchange of the DNA content occurs between the wild type gene in the yeast genome and the mutant gene (including the selection marker gene) that is flanked by the two homologous sequence segments. By selecting for the positive nutritional marker, surviving yeast cells will lose the original wild type gene and will adopt the mutant gene. Pearson B M, Hernando Y, and Schweizer M, (1998) Yeast 14: 391-399. This mechanism has also been used to make systematic mutations in all 6,000 yeast genes or ORFs for functional genomics studies. Because the exchange is reciprocal, a similar approach has been used successfully for cloning yeast genomic fragments into a plasmid vector. Iwasaki T, Shirahige K, Yoshikawa H, and Ogasawara N, Gene 1991, 109 (1): 81-87.

[0204] By using homologous recombination in yeast, gene fragments or synthetic oligonucleotides can also be cloned into a plasmid vector without a ligation step. In this application, a targeted gene fragment is usually obtained by PCR amplification (or by conventional restriction digestion out of an original cloning vector). Two short fragment sequences that are homologous to the plasmid vector are added to the 5′ and 3′ ends of the target gene fragment in the PCR amplification. This can be achieved by using a pair of PCR primers that incorporate the added sequences. The plasmid vector typically includes a positive selection marker, such as a nutritional enzyme allele (e.g., ura3) or an antibiotic resistance marker (e.g., geneticin (G418) resistance). The plasmid vector is linearized by a unique restriction cut in between the sequence homologies that are shared with the PCR-amplified target, thereby creating an artificial gap at the cleavage site. The linearized plasmid vector and the target gene fragment flanked by sequences homologous to the plasmid vector are co-transformed into a yeast host strain. The yeast recognizes the two stretches of sequence homology between the vector and target fragment, and facilitates a reciprocal exchange of DNA contents through homologous recombination at the gap. As a consequence, the target fragment is automatically inserted into the vector without ligation in vitro.
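The outcome of the gap-repair cloning described above can be illustrated in silico. The following is a minimal sketch only, assuming exact sequence identity between each flank and the corresponding vector terminus; the function name gap_repair, its parameters, and the toy sequences are hypothetical and are not part of the disclosed method. Recombination in yeast is a biological process and tolerates conditions that this string model does not capture.

```python
def gap_repair(vector_left: str, vector_right: str, insert: str, homology: int = 30) -> str:
    """Model insertion of a linear fragment into a linearized vector.

    vector_left / vector_right: top-strand sequences of the two vector arms
        on either side of the linearization site (hypothetical inputs).
    insert: linear fragment whose first and last `homology` bases are the
        5'- and 3'-flanking sequences.
    Returns the recombinant sequence; the shared flanks appear only once.
    """
    if homology <= 0 or len(insert) < 2 * homology:
        raise ValueError("insert must be at least as long as its two flanks")
    five_flank, three_flank = insert[:homology], insert[-homology:]
    # Assumption: homology is modeled as exact identity with the vector ends.
    if not vector_left.endswith(five_flank):
        raise ValueError("5' flank does not match the upstream vector terminus")
    if not vector_right.startswith(three_flank):
        raise ValueError("3' flank does not match the downstream vector terminus")
    payload = insert[homology:-homology]
    return vector_left + payload + vector_right


# Toy example with 8-base flanks (far shorter than would be used in practice).
left_arm = "TTGACAGCTAGCTCAGTCCT"
right_arm = "AGGTACCGAGCTCGAATTCA"
fragment = left_arm[-8:] + "ATGGCC" + right_arm[:8]
recombinant = gap_repair(left_arm, right_arm, fragment, homology=8)
```

In this sketch the fragment is accepted only when both of its ends match the vector termini, which mirrors the sequence-dependent nature of the exchange.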

[0205] There are a few factors that may influence the efficiency of homologous recombination in yeast. The efficiency of the gap repair is correlated with the length of the homologous sequences flanking both the linearized vector and the targeted gene. Preferably, a minimum of 30 base pairs may be required for the length of the homologous sequence, and 80 base pairs may give a near-optimized result. Hua, S. B. et al. (1997) “Minimum length of sequence homology required for in vivo cloning by homologous recombination in yeast” Plasmid 38:91-96. In addition, the reciprocal exchange between the vector and gene fragment is strictly sequence-dependent, i.e. it does not cause a frame shift in this type of cloning. Therefore, this unique characteristic of gap-repair cloning assures insertion of gene fragments with both high efficiency and precision. The high efficiency makes it possible to clone two or three targeted gene fragments simultaneously into the same vector in one transformation attempt. Raymond K., Pownder T. A., and Sexson S. L., (1999) Biotechniques 26: 134-141. The precise sequence conservation through homologous recombination makes it possible to clone targeted genes into expression or fusion vectors for direct functional examination. So far many functional or diagnostic applications have been reported using homologous recombination. El-Deiry W. W., et al., Nature Genetics 1: 45-49, 1992 (for p53), and Ishioka C., et al., PNAS, 94: 2449-2453, 1997 (for BRCA1 and APC).
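As an illustration of how flanking homology may be built into PCR primers, the following sketch appends vector-derived arms to gene-specific primer cores. It is a hypothetical helper, not a primer set of the present invention: the names add_homology_arms, cut_index, fwd_core and rev_core are assumptions, and the lower bound of 30 bases simply reflects the minimum length discussed above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_homology_arms(vector: str, cut_index: int, fwd_core: str, rev_core: str,
                      arm_len: int = 50) -> tuple[str, str]:
    """Build forward and reverse primers carrying vector homology.

    vector: top strand of the circular vector written linearly (assumption:
        the cut site is far enough from both ends of this string).
    cut_index: position at which the vector is linearized.
    fwd_core / rev_core: gene-specific primer sequences, each written 5' to 3'.
    arm_len: length of each homology arm in bases.
    """
    if arm_len < 30:
        raise ValueError("homology arms shorter than 30 bases are not modeled")
    if cut_index - arm_len < 0 or cut_index + arm_len > len(vector):
        raise ValueError("cut site too close to the end of the vector string")
    upstream_arm = vector[cut_index - arm_len:cut_index]
    downstream_arm = vector[cut_index:cut_index + arm_len]
    forward = upstream_arm + fwd_core
    # The reverse primer anneals to the top strand, so the downstream arm
    # is added as its reverse complement at the primer's 5' end.
    reverse = reverse_complement(downstream_arm) + rev_core
    return forward, reverse


toy_vector = "ACGT" * 30          # 120-base placeholder, not a real vector
fwd, rev = add_homology_arms(toy_vector, 60, "ATGGCC", "TCAGGT", arm_len=30)
```

An amplicon made with such primers would carry the upstream arm at its 5′ end and the downstream arm at its 3′ end, which is the arrangement required for insertion at the gap.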

[0206] A library of gene fragments may also be constructed in yeast by using homologous recombination. For example, a human brain cDNA library can be constructed as a two-hybrid fusion library in vector pJG4-5. Guidotti E., and Zervos A. S. (1999) “In vivo construction of cDNA library for use in the yeast two-hybrid systems” Yeast 15:715-720. It has been reported that a total of 6,000 pairs of PCR primers were used for amplification of 6,000 known yeast ORFs for a study of total yeast genomic protein interaction. Hudson, J. Jr, et al. (1997) Genome Res. 7:1169-1173. Uetz et al. conducted a comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Uetz et al. (2000) Nature 403:623-627. The protein-protein interaction map of the budding yeast was studied by using a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Ito et al. (2000) Proc. Natl. Acad. Sci. USA. 97:1143-1147. The genomic protein linkage map of Vaccinia virus was studied by McCraith S., Holtzman T., Moss B., and Fields, S. (2000) Proc. Natl. Acad. Sci. USA 97: 4879-4884.

[0207] According to the present invention, the V1 and V2 sequences are introduced into an expression vector by homologous recombination performed directly in yeast cells.

[0208] 1) Cloning of V1 and V2 in Separate Fragments into an Expression Vector through Two Independent Events of Homologous Recombination in Yeast

[0209] In one embodiment of the method for generating the library of expression vectors, the V1 and V2 sequences may be cloned into an expression vector in vivo in two separate fragments through two independent events of homologous recombination in yeast.

[0210] The method comprises:

[0211] a) transforming into yeast cells i) a linearized yeast expression vector having a 5′- and 3′-terminus sequence at a first site of linearization; and ii) a library of first insert nucleotide sequences that are linear and double-stranded, each of the first insert sequences comprising a first nucleotide sequence V1 encoding a first polypeptide subunit, and a 5′- and 3′-flanking sequence at the ends of the first insert sequence which are sufficiently homologous to the 5′- and 3′-terminus sequences of the vector at the first site of linearization, respectively, to enable homologous recombination to occur;

[0212] b) having homologous recombination occur between the vector and the first insert sequence in the transformed yeast cells, such that the first insert sequence is included in the vector;

[0213] c) isolating from the transformed yeast cells the vectors that contain the library of the first insert sequences;

[0214] d) linearizing the vectors containing the library of the first insert sequences to generate a 5′- and 3′-terminus sequence at a second site of linearization;

[0215] e) transforming into yeast cells

[0216] i) the linearized yeast expression vectors in step d), and

[0217] ii) a library of second insert nucleotide sequences that are linear and double-stranded, each of the second insert sequences comprising a second nucleotide sequence V2 encoding a second polypeptide subunit, and a 5′- and 3′-flanking sequence at the ends of the second insert sequence which are sufficiently homologous to the 5′- and 3′-terminus sequences of the vector at the second site of linearization, respectively, to enable homologous recombination to occur; and

[0218] f) having homologous recombination occur between the linearized yeast expression vector at the second linearization site and the second insert sequences in the transformed yeast cells, such that the second insert sequence is included in the vector.

[0219] According to the embodiment, the first polypeptide subunit and the second polypeptide subunit are expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters, or by expressing them bicistronically from the same promoter via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.

[0220] According to the embodiment, the 5′- or 3′-flanking sequence of the insert nucleotide sequence is preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 60-80 bp in length.

[0221] FIG. 2 illustrates an embodiment of this method according to the present invention. The coding sequences for V1 (e.g., V_(H)+C_(H)1) and V2 (e.g., V_(L)+C_(L)) are carried by separate PCR fragments and cloned into an expression vector sequentially following two independent events of homologous recombination in yeast.

[0222] As illustrated in FIG. 2, the V1 fragment has a 5′ flanking sequence and a 3′ flanking sequence that are homologous to the 5′ and 3′ termini of a linearized expression vector, respectively. When the V1 fragment and the linearized expression vector are introduced into a host cell, for example, transformed into a yeast cell, the “gap” (the first linearization site) created by linearization of the expression vector is filled by the V1 fragment insert through recombination of the homologous sequences at the 5′ and 3′ termini of these two linear double-stranded DNAs. Through this event of homologous recombination, a library of circular vectors carrying the variable sequence V1 is generated.

[0223] This library of circular vectors is then cleaved at a second linearization site, for example, a site downstream of V1. The V2 fragment has a 5′ flanking sequence and a 3′ flanking sequence that are homologous to the 5′ and 3′ termini of the linearized expression vector at the second linearization site. The V2 fragment and the linearized expression vector are transformed into a yeast cell. Through a second event of homologous recombination, the V2 fragment is inserted into the linearized expression vector at the second linearization site. As a result, a library of circular vectors carrying the variable sequences V1 and V2 is generated.

[0224] Each flanking sequence added to the 5′ and 3′ termini of the V1 and V2 coding sequences is preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 45-55 bp in length.

[0225] When the V1 and V2 coding sequences are inserted into an expression vector containing an AD domain, it is preferred that the reading frames of the V1 or V2 fragments are conserved with the upstream AD reading frame.

[0226] Depending on the cloning expression vector used, additional features such as affinity tags and unique restriction enzyme recognition sites may be added to the expression vector for the convenience of detection and purification of the inserted V1 and V2 sequences. Examples of affinity tags include, but are not limited to, a polyhistidine tract, polyarginine, glutathione-S-transferase (GST), maltose binding protein (MBP), a portion of staphylococcal protein A (SPA), and various immunoaffinity tags (e.g. protein A) and epitope tags such as those recognized by the EE (Glu-Glu) antipeptide antibodies.

[0227] In a preferred embodiment, the V1 and V2 sequences may be the coding sequences for a heavy chain and a light chain, respectively, which are derived from a human antibody repertoire. To generate the V1 and V2 coding sequences from the human antibody repertoire, a complex human antibody cDNA gene pool may be generated by using methods known in the art. Sambrook, J., et al. (1989) Molecular Cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; and Ausubel, F. M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, NY.

[0228] Total RNA may be isolated from sources such as the white cells (mainly B cells) contained in peripheral blood supplied by un-immunized humans, or from human fetal spleen and lymph nodes. First strand cDNA synthesis may be performed by using methods known in the art, such as those described by Marks et al. Marks et al. (1991) Eur. J. Immunol. 21:985-991.

[0229] Specifically, a mixture of heavy and light chain cDNA primer sets designed to anneal to the constant regions may be used for priming the synthesis of cDNA of heavy chain and light chain (both kappa Vκ and lambda Vλ) antibody genes. Examples of how to generate the cDNA library of human antibody genes are illustrated in Example 1.

[0230] The coding sequences of human heavy and light chain genes may be amplified from the antibody cDNA library generated above by using PCR primer sets used in combination to prime the heavy chain variable region (V_(H)), the full heavy chain including both the variable and constant regions (V_(H)+C_(H), C_(H) including C_(H)1, C_(H)2, and C_(H)3), the light chain variable region (V_(L), including Vλ and Vκ) and the full light chain including both the variable and constant regions (V_(L)+C_(L)). Each of the PCR primers may include both an antibody partial sequence and a 5′ or 3′ flanking sequence for facilitating homologous recombination between the antibody fragments and a cloning expression vector. Examples of these primers are listed in Table 2.

[0231] 2) Cloning of V1 (or V2) into an Expression Vector in Bacteria Followed by Cloning V2 (or V1) into the Vector via Homologous Recombination in Yeast

[0232] In another embodiment of the method for generating the library of expression vectors, the V1 (or V2) sequences are cloned in bacteria into a yeast-bacteria shuttle vector, such as a modified vector derived from pACT2 (supplied by Clontech, Palo Alto, Calif.). The V2 (or V1) sequences are then inserted into the library of expression vectors comprising V1 (or V2) via homologous recombination in yeast.

[0233] In one embodiment, the method comprises: transforming into yeast cells a library of insert nucleotide sequences that are linear and double-stranded, and a library of linearized yeast expression vectors, each having a 5′- and 3′-terminus sequence at the site of linearization.

[0234] The linearized yeast expression vectors of the vector library comprise a first polynucleotide sequence V1 encoding a first polypeptide subunit and varying within the vector library. The insert sequences of the insert library comprise a second nucleotide sequence V2 encoding a second polypeptide subunit and varying within the insert library. Each of the insert sequences also comprises a 5′- and 3′-flanking sequence at the ends of the insert sequence. The 5′- and 3′-flanking sequences of the insert sequence are sufficiently homologous to the 5′- and 3′-terminus sequences of the linearized yeast expression vector, respectively, to enable homologous recombination to occur.

[0235] In this embodiment, the first polypeptide subunit and the second polypeptide subunit are expressed as a single fusion protein. Also, the first and second nucleotide sequences each independently vary within the library of expression vectors.

[0236] According to the embodiment, the 5′- or 3′-flanking sequence of the insert nucleotide sequence is preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 60-80 bp in length.

[0237] FIG. 3 illustrates an embodiment of this method according to the present invention. The coding sequences for V1 (e.g., antibody heavy chain or light chain) are amplified by PCR to generate separate fragments which are directionally cloned into an expression vector in bacteria, resulting in a library of expression vectors. The V2 inserts are then cloned into this library of expression vectors through homologous recombination in yeast. The detailed procedures are described in Example 1.

[0238] As illustrated in FIG. 3, the V1 fragment has a restriction site at its 5′ terminus that matches a restriction site at the 5′ terminus of a linearized expression vector, and a restriction site at its 3′ terminus that matches a restriction site at the 3′ terminus of the linearized expression vector. By using a method of directional cloning, the V1 fragments are ligated into the expression vectors to generate a library of vectors encoding V1. The resulting library of closed circular vectors is transformed into and propagated in bacteria.

[0239] The V1-encoding vector library is then cleaved at a second linearization site, for example, a site downstream of V1. The V2 fragment has a 5′ flanking sequence and a 3′ flanking sequence that are homologous to the 5′ and 3′ termini of the linearized expression vector at the second linearization site. The V2 fragment and the linearized expression vector are transformed into a yeast cell. Through homologous recombination in yeast, the V2 fragment is inserted into the linearized expression vector at the second linearization site. As a result, a library of circular vectors carrying the variable sequences V1 and V2 is generated.

[0240] Each flanking sequence added to the 5′ and 3′ termini of the V2 sequence is preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 45-55 bp in length.

[0241] By using methods similar to those described above, the variable sequences V1 and V2 can be inserted into an expression vector containing an activation domain (AD) or a DNA-binding domain (BD) of a transcription activator. The AD or BD domain may be positioned upstream or downstream of V1 (or V2). It is preferred that the reading frames of the V1 (or V2) fragments are conserved with the AD or BD reading frame.

[0242] The expression vector containing an AD (or BD) domain may be any vector engineered to carry the coding sequence of the AD domain. The expression vector is preferably a yeast vector such as pGAD10 (Feilotter et al. (1994) “Construction of an improved host strain for two hybrid screening” Nucleic Acids Res. 22: 1502-1503), pACT2 (Harper et al. (1993) “The p21 Cdk-interacting protein Cip1 is a potent inhibitor of G1 cyclin-dependent kinases” Cell 75:805-816), and pGADT7 (“Matchmaker Gal4 two hybrid system 3 and libraries user manual” (1999), Clontech PT3247-1, supplied by Clontech, Palo Alto, Calif.).

[0243] The expression vector containing an AD (or BD) domain may also include another expression unit which is capable of expressing the second polypeptide subunit encoded by V2.

[0244] Expression of V1 and/or V2 may be separately under the transcriptional control of a constitutive promoter or an inducible promoter. One example of such an expression vector is available from Clontech, pBridge® (catalog No. 6184-1). The expression vector, pBridge®, contains one expression unit that controls expression of a Gal4 BD domain and another expression unit that includes an inducible promoter, Pmet25. Tirode, E. et al. (1997) J. Biol. Chem. 272:22995-22999.

[0245] The linearized vector DNA may be mixed with an equal or excess amount of the V1 or V2 inserts. The linearized vector DNA and the inserts are co-transformed into host cells, such as competent yeast cells. Recombinant clones may be selected based on survival of cells in a nutritional selection medium or based on other phenotypic markers. Either the linearized vector or the insert alone may be used as a control for determining the efficiency of recombination and transformation.

[0246] Other recombination systems may be used to generate the library of expression vectors of the present invention. For example, the recombination between the library of V1 or V2 sequences and the recipient expression vector may be facilitated by site-specific recombination.

[0247] Site-specific recombination employs a site-specific recombinase, an enzyme which catalyzes the exchange of DNA segments at specific recombination sites. Site-specific recombinases are present in some viruses and bacteria, and have been characterized as having both endonuclease and ligase properties. These recombinases, along with associated proteins in some cases, recognize specific sequences of bases in DNA and exchange the DNA segments flanking those sequences. Landy, A. (1993) Current Opinion in Biotechnology 3:699-707.

[0248] A typical site-specific recombinase is CRE recombinase. CRE is a 38-kDa product of the cre (cyclization recombination) gene of bacteriophage P1 and is a site-specific DNA recombinase of the Int family. Sternberg, N. et al. (1986) J. Mol. Biol. 187: 197-212. CRE recognizes a 34-bp site on the P1 genome called loxP (locus of X-over of P1) and efficiently catalyzes reciprocal conservative DNA recombination between pairs of loxP sites. The loxP site [SEQ ID NO: 1] consists of two 13-bp inverted repeats flanking an 8-bp nonpalindromic core region. CRE-mediated recombination between two directly repeated loxP sites results in excision of the DNA between them as a covalently closed circle. CRE-mediated recombination between pairs of loxP sites in inverted orientation results in inversion of the intervening DNA rather than excision. Breaking and joining of DNA is confined to discrete positions within the core region and proceeds one strand at a time by way of a transient phosphotyrosine DNA-protein linkage with the enzyme.
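The structure of the loxP site and the orientation-dependent outcome of recombination can be sketched as follows. The 34-base sequence used here is the commonly cited wild-type loxP sequence and is an assumption of this sketch; its written orientation may differ from SEQ ID NO: 1. The function names and the string-based model are hypothetical and ignore the enzymology entirely.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Commonly cited wild-type loxP (assumption; orientation may differ from SEQ ID NO: 1).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


LOXP_RC = reverse_complement(LOXP)


def split_lox(site: str) -> tuple[str, str, str]:
    """Split a 34-base lox site into left repeat (13), core (8), right repeat (13)."""
    if len(site) != 34:
        raise ValueError("a lox site is modeled as exactly 34 bases")
    return site[:13], site[13:21], site[21:]


def find_all(seq: str, pattern: str) -> list[int]:
    """Return the start positions of every occurrence of pattern in seq."""
    hits, start = [], seq.find(pattern)
    while start != -1:
        hits.append(start)
        start = seq.find(pattern, start + 1)
    return hits


def cre_recombine(seq: str):
    """Model recombination between exactly two loxP sites on one linear molecule.

    Returns (kind, linear_product, excised_circle); the circle is None for inversion.
    """
    forward, reverse = find_all(seq, LOXP), find_all(seq, LOXP_RC)
    if len(forward) + len(reverse) != 2:
        raise ValueError("this sketch handles exactly two loxP sites")
    if len(forward) == 2 or len(reverse) == 2:
        # Directly repeated sites: intervening DNA leaves as a circle with one site.
        first, second = sorted(forward or reverse)
        return "excision", seq[:first] + seq[second:], seq[first:second]
    # Inverted sites: the DNA between the two sites is flipped in place.
    first, second = sorted([forward[0], reverse[0]])
    inner = seq[first + 34:second]
    return "inversion", seq[:first + 34] + reverse_complement(inner) + seq[second:], None


left_repeat, core, right_repeat = split_lox(LOXP)
direct = cre_recombine("GGGG" + LOXP + "ACCTGA" + LOXP + "CCCC")
inverted = cre_recombine("GGGG" + LOXP + "ACCTGA" + LOXP_RC + "CCCC")
```

In this model the right repeat is the reverse complement of the left repeat, while the core differs from its own reverse complement, which is what gives the site a direction and makes the excision and inversion outcomes distinguishable.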

[0249] The CRE recombinase also recognizes a number of variant or mutant lox sites relative to the loxP sequence. Examples of these CRE recombination sites include, but are not limited to, the loxB, loxL and loxR sites which are found in the E. coli chromosome. Hoess et al. (1986) Nucleic Acids Res. 14:2287-2300. Other variant lox sites include, but are not limited to, loxP3, loxP23, loxΔ86, loxΔ117, loxP511 [SEQ ID NO: 2], and loxC2 [SEQ ID NO: 3]. Table 1 lists examples of lox sites that may be used in the present invention, including wild-type loxP sites loxP WT [SEQ ID NO: 1] and loxP2 [SEQ ID NO: 5], and other loxP variants with mutations in the 13-bp inverted repeat regions and/or the 8-bp nonpalindromic core region (underlined): loxP511 [SEQ ID NO: 2], loxC2 [SEQ ID NO: 3], loxP1 [SEQ ID NO: 4], loxP3 [SEQ ID NO: 6], loxP4 [SEQ ID NO: 7], loxP5 [SEQ ID NO: 8], loxP6 [SEQ ID NO: 9], loxP7 [SEQ ID NO: 10], loxP8 [SEQ ID NO: 11], loxP9 [SEQ ID NO: 12], and loxP10 [SEQ ID NO: 13].

[0250] Examples of recombination sites recognized by non-CRE site-specific recombinases include, but are not limited to: att sites recognized by the Int recombinase of bacteriophage λ (e.g. att1, att2, att3, attP, attB, attL, and attR), the FRT sites recognized by FLP recombinase of the 2μ plasmid of Saccharomyces cerevisiae, the recombination sites recognized by the resolvase family, and the recombination site recognized by the transposase of Bacillus thuringiensis.

[0251] Subsequent analysis may also be carried out to determine the efficiency of homologous recombination that results in correct insertion of the V1 and V2 sequences into the expression vector. For example, PCR amplification of the V1 or V2 inserts directly from the selected yeast clones may reveal how many clones are recombinant. Libraries with a minimum of 90% recombinant clones are preferred. The same PCR amplification of selected clones may also reveal the insert size. Although a small fraction of the library may contain double or triple inserts, the majority (>90%) preferably has a single insert of the expected size.
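The two 90% criteria above can be tallied from colony PCR results as in the following sketch. The function summarize_colony_pcr, its size tolerance, and the example amplicon sizes are all hypothetical; in particular, the sketch assumes that primers flank the insertion site so that an empty vector yields a short, recognizable product, and it computes the single-insert fraction over recombinant clones only, which is one possible reading of the criterion.

```python
def summarize_colony_pcr(sizes_bp, expected_bp, empty_bp,
                         tolerance=0.10, threshold=0.90):
    """Summarize colony PCR amplicon sizes for a sample of library clones.

    sizes_bp: observed amplicon size for each screened clone.
    expected_bp: amplicon size for a single insert of the expected length.
    empty_bp: amplicon size for a vector that closed without an insert.
    tolerance: fractional window used to call a size match (assumption).
    threshold: minimum acceptable fraction for both criteria.
    """
    if not sizes_bp:
        raise ValueError("at least one clone is required")

    def near(size, target):
        return abs(size - target) <= tolerance * target

    total = len(sizes_bp)
    empty = sum(1 for size in sizes_bp if near(size, empty_bp))
    single = sum(1 for size in sizes_bp if near(size, expected_bp))
    recombinant = total - empty
    frac_recombinant = recombinant / total
    # Assumption: single-insert fraction is taken over recombinant clones.
    frac_single = single / recombinant if recombinant else 0.0
    return {
        "recombinant_fraction": frac_recombinant,
        "single_insert_fraction": frac_single,
        "passes": frac_recombinant >= threshold and frac_single >= threshold,
    }


# Hypothetical screen of 20 clones: 18 single inserts, 1 oversized, 1 empty.
report = summarize_colony_pcr([750] * 18 + [1400] + [150],
                              expected_bp=750, empty_bp=150)
```

A clone whose amplicon matches neither size (for example, a double insert) counts as recombinant but not as a single insert in this tally.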

[0252] To verify sequence diversity of the inserts in the selected clones, PCR amplification products with the correct insert size may be fingerprinted with frequently cutting restriction enzymes. From a gel electrophoresis pattern, it may be determined whether the clones analyzed are identical or are distinct and diversified. The PCR products may also be sequenced directly to reveal the identity of inserts and the fidelity of the cloning procedure and to prove the independence and diversity of the clones.

[0253] In an embodiment where the V1 and V2 sequences are the coding sequences for a heavy chain and a light chain derived from a human antibody repertoire, respectively, monoclonal antibodies may be generated from hybridoma cell lines as controls by following the same procedures described above. Examples of hybridoma cell lines include, but are not limited to, an anti-GFP antibody producing cell line (Clontech), anti-p53 antibody producing cell lines (NeoMarker), and other hybridoma cell lines available from ATCC (Atlanta). The hybridoma cell line is subjected to the same procedures described above, i.e., RNA isolation, cDNA synthesis, PCR amplification, and homologous recombination in yeast. Other antibody libraries may also be generated from mouse fetal liver and fetal spleen using the same principle.

[0254] The mouse antibody library generated can provide a direct control for an existing individual mouse monoclonal antibody with its cognate antigen. Most studies of antigen-antibody interaction have been performed with mouse antibodies. The mouse antibody library should serve as an excellent control in the selection of a human antibody library against a target antigen by the yeast two-hybrid method described below.

[0255] 3. Selection of Affinity Binding Pairs between the Library of Fusion Proteins of the Present Invention and Target Proteins

[0256] The present invention also provides methods for screening protein-protein or protein-peptide binding pairs in a yeast two-hybrid system.

[0257] The two-hybrid system is a selection scheme designed to screen for polypeptide sequences which bind to a predetermined polypeptide sequence present in a fusion protein. Chien et al. (1991) Proc. Natl. Acad. Sci. (USA) 88: 9578. This approach identifies protein-protein interactions in vivo through reconstitution of a transcriptional activator, the yeast Gal4 transcription protein. Fields and Song (1989) Nature 340: 245. The method is based on the properties of the yeast Gal4 protein, which consists of separable domains responsible for DNA binding and transcriptional activation. Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4 DNA-binding domain (BD) fused to a polypeptide sequence of a known protein and the other consisting of the Gal4 activation domain (AD) fused to a polypeptide sequence of a second protein, are constructed and introduced into a yeast host cell. Intermolecular binding between the two fusion proteins reconstitutes the Gal4 DNA-binding domain with the Gal4 activation domain, which leads to the transcriptional activation of a reporter gene (e.g., lacZ, HIS3) which is operably linked to a Gal4 binding site.

[0258] Typically, the two-hybrid method is used to identify novel polypeptide sequences which interact with a known protein. Silver and Hunt (1993) Mol. Biol. Rep. 17: 155; Durfee et al. (1993) Genes Devel. 7: 555; Yang et al. (1992) Science 257: 680; Luban et al. (1993) Cell 73: 1067; Hardy et al. (1992) Genes Devel. 6: 801; Bartel et al. (1993) Biotechniques 14: 920; and Vojtek et al. (1993) Cell 74: 205. The two-hybrid system was used to detect interactions between three specific single-chain variable fragments (scFv) and a specific antigen. De Jaeger et al. (2000) FEBS Lett. 467:316-320. The two-hybrid system was also used to screen against cell surface proteins or receptors, such as receptors of the hematopoietic superfamily, in yeast. Ozenberger, B. A., and Young, K. H. (1995) “Functional interaction of ligands and receptors of hematopoietic superfamily in yeast” Mol Endocrinol. 9:1321-1329.

[0259] Variations of the two-hybrid method have been used to identify mutations of a known protein that affect its binding to a second known protein. Li and Fields (1993) FASEB J. 7: 957; Lalo et al. (1993) Proc. Natl. Acad. Sci. (USA) 90: 5524; Jackson et al. (1993) Mol. Cell. Biol. 13: 2899; and Madura et al. (1993) J. Biol. Chem. 268: 12046.

[0260] Two-hybrid systems have also been used to identify interacting structural domains of two known proteins or domains responsible for oligomerization of a single protein. Bardwell et al. (1993) Mol. Microbiol. 8: 1177; Chakraborty et al. (1992) J. Biol. Chem. 267: 17498; Staudinger et al. (1993) J. Biol. Chem. 268: 4608; Milne and Weaver (1993) Genes Devel. 7: 1755; Iwabuchi et al. (1993) Oncogene 8: 1693; and Bogerd et al. (1993) J. Virol. 67: 5030.

[0261] Variations of two-hybrid systems have been used to study the in vivo activity of a proteolytic enzyme. Dasmahapatra et al. (1992) Proc. Natl. Acad. Sci. (USA) 89: 4159. Alternatively, an E. coli/BCCP interactive screening system was used to identify interacting protein sequences (i.e., protein sequences which heterodimerize or form higher order heteromultimers). Germino et al. (1993) Proc. Natl. Acad. Sci. (U.S.A.) 90: 933; and Guarente L (1993) Proc. Natl. Acad. Sci. (U.S.A.) 90: 1639.

[0262] Typically, selection of a binding protein using a two-hybrid method relies upon a positive association between two Gal4 fusion proteins, thereby reconstituting a functional Gal4 transcriptional activator which then induces transcription of a reporter gene operably linked to a Gal4 binding site. Transcription of the reporter gene produces a positive readout, typically manifested either (1) as an enzyme activity (e.g., β-galactosidase) that can be identified by a colorimetric enzyme assay or (2) as enhanced cell growth on a defined medium (e.g., HIS3 and ADE2). Thus, the method is suited for identifying a positive interaction of polypeptide sequences, such as antibody-antigen interactions.

[0263] False positive clones, which indicate activation of the reporter gene irrespective of the specific interaction between the two hybrid proteins, may arise in two-hybrid screening. Various procedures have been developed to reduce and eliminate false positive clones from the final positives, for example: 1) prescreening for clones that contain the target vector and show a positive signal in the absence of the two-hybrid partner (Bartel, P. L., et al. (1993) “Elimination of false positives that arise in using the two-hybrid system” BioTechniques 14:920-924); 2) using multiple reporters such as His3, β-galactosidase, and Ade2 (James, P. et al. (1996) “Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast” Genetics 144:1425-1436); 3) using multiple reporters, each of which is under the control of a different GAL4-responsive promoter, such as those in yeast strain Y190, where the His3 and β-Gal reporters are each under the control of a different promoter, Gal1 or Gal10, but both respond to Gal4 signaling (Durfee, T., et al. (1993) “The retinoblastoma protein associates with the protein phosphatase type 1 catalytic subunit” Genes Devel. 7:555-569); and 4) post-screening assays such as testing isolates with a target consisting of the GAL4-BD alone.

[0264] In addition, false positive clones may also be eliminated by using unrelated targets to confirm specificity. This is a standard control procedure in the two-hybrid system which can be performed after the library isolate is confirmed by the above-described procedures 1)-4). Typically, the library clones are confirmed by co-transforming the initially isolated library clones back into the yeast reporter strain with one or more control targets unrelated to the target used in the original screening. Selection is conducted to eliminate those library clones that show positive activation of the reporter gene and thus indicate non-specific interactions with multiple, related proteins.
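The specificity check described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the disclosed method: the clone names, target names, and the `activation` table (which records, for each re-tested clone, the set of targets that turned the reporter on) are hypothetical.

```python
def specific_clones(activation, original_target, control_targets):
    """Keep clones that activate the reporter with the original target
    but with none of the unrelated control targets.

    activation: dict mapping clone name -> set of target names for which
                the reporter was activated after co-transformation
                (hypothetical data layout, assumed for illustration).
    """
    controls = set(control_targets)
    kept = []
    for clone, hits in activation.items():
        binds_original = original_target in hits
        binds_any_control = bool(hits & controls)
        if binds_original and not binds_any_control:
            kept.append(clone)
    return kept


# Hypothetical re-test results for three library isolates.
results = {
    "clone_1": {"target_X"},                 # specific
    "clone_2": {"target_X", "control_A"},    # also activates with a control
    "clone_3": set(),                        # not reproducible
}
print(specific_clones(results, "target_X", ["control_A", "control_B"]))
```

Under these assumptions only `clone_1` survives, since `clone_2` also activates the reporter with an unrelated control and `clone_3` does not reproduce the original signal.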

[0265] The present invention provides efficient methods for screening the polypeptides encoded by V1 and V2 in the library of expression vectors for their affinity binding to one or more target proteins.

[0266] According to the present invention, the method comprises:

[0267] expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed between a first polypeptide subunit whose sequence varies within the library, and a second polypeptide subunit whose sequence varies within the library independently of the first polypeptide; expressing one or more target fusion proteins in the yeast cells expressing the tester proteins, each of the target fusion proteins comprising a target peptide or protein; and

[0268] selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target fusion protein.

[0269] According to the method, the diversity of the first or the second polypeptide subunit is preferably between 10³-10⁸, more preferably between 10⁴-10⁸ and most preferably between 10⁵-10⁸.

[0270] Also according to the method, the diversity of the protein complexes encoded by the library of expression vectors is preferably between 10⁶-10¹⁸, more preferably between 10⁹-10¹⁸, and most preferably between 10¹⁰-10¹⁸.
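Because the first and second subunit sequences vary independently, the number of distinct pairings grows as the product of the two subunit diversities. The short Python sketch below illustrates that arithmetic only; it assumes, as an idealization, that every first subunit can be paired with every second subunit, and the function name is hypothetical.

```python
def combined_diversity(v1_diversity: int, v2_diversity: int) -> int:
    """Number of distinct V1/V2 pairings, assuming free combination."""
    if v1_diversity < 1 or v2_diversity < 1:
        raise ValueError("each subunit diversity must be at least 1")
    return v1_diversity * v2_diversity


# Two subunit libraries of 10^3 members each give 10^6 possible complexes;
# two libraries of 10^5 members each give 10^10.
print(combined_diversity(10**3, 10**3))
print(combined_diversity(10**5, 10**5))
```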

[0271] A feature of the present invention is that the first and second polypeptide subunits may be selected entirely independently of the target peptide or protein and need not be based in any way upon one or more proteins known to bind to the target. As a result, the diversities of the first and second polypeptide subunits may each be independently derived from libraries of precursor sequences that are not specifically designed for the target peptide or protein. For example, the libraries of precursor sequences need not be derived from a small group (e.g. 2-20) of genes with predetermined sequences and encoding proteins that are known to bind the target peptide or protein.

[0272] The diversities of the first and second polypeptide subunits also need not be derived from one or more proteins that are known to bind to the target peptide or protein. For example, the one or more proteins need not be derived from a small group (e.g. 2-20) of proteins with predetermined sequences that are known to bind to the target peptide or protein.

[0273] The diversities of the first and second polypeptide subunits also need not be generated by mutagenizing one or more proteins that are known to bind to the target peptide or protein. For example, the first and second polypeptide subunits need not be generated by mutagenizing a small group (e.g. 2-20) of proteins with predetermined sequences and known to bind to the target peptide or protein.

[0274] In a variation of the embodiment, a single target fusion protein is expressed and screened against the library of tester proteins. According to the variation, the step of expressing the library of tester protein complexes may include transforming a library of tester expression vectors into the yeast cells which contain a reporter construct comprising the reporter gene whose expression is under transcriptional control of a transcription activator comprising an activation domain and a DNA binding domain.

[0275] Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The domain encoded by the first transcription sequence and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0276] Optionally, the step of expressing the target protein complexes includes transforming a target expression vector into the yeast cells simultaneously or sequentially with the library of tester expression vectors. The target expression vector comprises a second transcription sequence encoding either the activation domain AD or the DNA binding domain BD of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.

[0277] FIG. 4 illustrates a flow diagram of a preferred embodiment of the above described method. As illustrated in FIG. 4, the sequence library containing V1 fused with an AD domain upstream and V2 is carried by a library of expression vectors, the AD-V1/V2 vectors. The coding sequence of the target protein (labeled as “Target”) is contained in another expression vector and fused with a BD domain, forming the BD-Target vector.

[0278] The AD-V1/V2 vector and the BD-Target vector may be co-transformed into a yeast cell by using methods known in the art. Gietz, D. et al. (1992) “Improved method for high efficiency transformation of intact yeast cells” Nucleic Acids Res. 20:1425. The construct carrying the specific DNA binding site and the reporter gene (labeled as “Reporter”) may be stably integrated into the genome of the host cell or transiently transformed into the host cell. Upon expression of the sequences in the expression vectors, the library of protein complexes comprising AD-V1 fusion and V2, labeled as the AD-V1/V2 protein complexes, undergo protein folding in the host cell and adopt various conformations. Some of the AD-V1/V2 protein complexes may bind to the Target protein expressed by the BD-Target vector in the host cell, thereby bringing the AD and BD domains into close proximity in the promoter region (i.e., the specific DNA binding site) of the reporter construct and thus reconstituting a functional transcription activator composed of the AD and BD domains. As a result, the AD activates the transcription of the reporter gene downstream from the specific DNA binding site, resulting in expression of the reporter gene, such as the lacZ reporter gene. Clones showing the phenotype of the reporter gene expression are selected, and the AD-V1/V2 vectors are isolated. The coding sequences for V1 and V2 are identified and characterized.

[0279] Alternatively, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types.

[0280] The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes. Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain AD or the DNA binding domain BD of the transcription activator, a first nucleotide sequence V1 encoding the first polypeptide subunit, and a second nucleotide sequence V2 encoding the second polypeptide subunit.

[0281] The second population of haploid yeast cells comprises a target expression vector. The target expression vector comprises a second transcription sequence encoding either the activation domain AD or the DNA binding domain BD of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising the reporter gene whose expression is under transcriptional control of the transcription activator.

[0282] In this method, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains may be conducted in a rich nutritional culture medium.

[0283] FIG. 5 illustrates a flow diagram of a preferred embodiment of the above described method. As illustrated in FIG. 5, the sequence library containing V1 fused with an AD domain upstream and V2 is carried by a library of expression vectors, the AD-V1/V2 vectors. The library of the AD-V1/V2 vectors is transformed into haploid yeast cells such as the a type strain of yeast.

[0284] The coding sequence of the target protein (labeled as “Target”) is contained in another expression vector and fused with a BD domain, forming the BD-Target vector. The BD-Target vector is transformed into haploid cells of the mating type opposite to that of the haploid cells containing the AD-V1/V2 vectors, such as the α type strain of yeast. The construct carrying the specific DNA binding site and the reporter gene (labeled as “Reporter”) may be transformed into the haploid cells of either the type a or type α strain of yeast.

[0285] The haploid cells of the type a and type α strains of yeast are mated under suitable conditions such as low speed of shaking in liquid culture, physical contact in solid medium culture, and rich medium such as YPD. Bendixen, C. et al. (1994) “A yeast mating-selection scheme for detection of protein-protein interactions”, Nucleic Acids Res. 22:1778-1779. Finley, Jr., R. L. & Brent, R. (1994) “Interaction mating reveals binary and ternary connections between Drosophila cell cycle regulators”, Proc. Natl. Acad. Sci. USA, 91:12980-12984. As a result, the AD-V1/V2 and BD-Target expression vectors and the Reporter construct are taken into the parental diploid cells of the type a and type α strains of haploid yeast cells.
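As a rough illustration of how mating brings the constructs together, the following Python sketch models each haploid cell as a mating type plus a set of construct labels, and a diploid as the union of the two sets. The `Haploid` class and `mate` function are hypothetical names introduced only for this sketch; the construct labels follow FIG. 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Haploid:
    mating_type: str          # "a" or "alpha"
    constructs: frozenset     # labels of vectors/constructs carried


def mate(x: Haploid, y: Haploid) -> frozenset:
    """Return the constructs present in the diploid formed from x and y."""
    if {x.mating_type, y.mating_type} != {"a", "alpha"}:
        raise ValueError("mating requires one a-type and one alpha-type cell")
    return x.constructs | y.constructs


library_cell = Haploid("a", frozenset({"AD-V1/V2"}))
target_cell = Haploid("alpha", frozenset({"BD-Target", "Reporter"}))
print(sorted(mate(library_cell, target_cell)))
```

In this simplified model the Reporter construct may sit in either haploid parent; the diploid carries all three elements either way.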

[0286] Upon expression of the sequences in the expression vectors in the parental diploid cells, the library of protein complexes formed between AD-V1 fusion and V2, labeled as the AD-V1/V2 protein complexes, undergo protein folding in the host cell and adopt various conformations. Some of the AD-V1/V2 protein complexes may bind to the Target protein expressed by the BD-Target vector in the parental diploid cell, thereby bringing the AD and BD domains into close proximity in the promoter region (i.e., the specific DNA binding site) of the reporter construct and thus reconstituting a functional transcription activator composed of the AD and BD domains. As a result, the AD activates the transcription of the reporter gene downstream from the specific DNA binding site, resulting in expression of the reporter gene, such as the lacZ reporter gene. Clones showing the phenotype of the reporter gene expression are selected, and the AD-V1/V2 vectors are isolated. The coding sequences for V1 and V2 are identified and characterized.

[0287] A wide variety of reporter genes may be used in the present invention. Examples of proteins encoded by reporter genes include, but are not limited to, easily assayed enzymes such as β-galactosidase, α-galactosidase, luciferase, β-glucuronidase, chloramphenicol acetyl transferase (CAT), secreted embryonic alkaline phosphatase (SEAP), fluorescent proteins such as green fluorescent protein (GFP), enhanced blue fluorescent protein (EBFP), enhanced yellow fluorescent protein (EYFP) and enhanced cyan fluorescent protein (ECFP); and proteins for which immunoassays are readily available such as hormones and cytokines. The expression of these reporter genes can also be monitored by measuring levels of mRNA transcribed from these genes.

[0288] When the screening of the V1 and V2 library is conducted in yeast cells, the reporter may be a nutritional reporter, which allows the yeast to grow on a specific selection medium plate. This is a powerful screening process, as has been shown in many published papers. Examples of the nutritional reporter include, but are not limited to, His3, Ade2, Leu2, Ura3, Trp1 and Lys2. The His3 reporter is described in Bartel, P. L. et al. (1993) “Using the two-hybrid system to detect protein-protein interactions”, in Cellular interactions in Development: A practical approach, ed. Hartley, D. A., Oxford Press, pages 153-179. The Ade2 reporter is described in James, P. et al. (1996) “Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast” Genetics 144:1425-1436.

[0289] For example, a library of antibody expression vectors may be transformed into haploid cells of the α mating type of a yeast strain. The antibody expression vector may contain an antibody light chain fused with an AD domain of the GAL4 transcription activator and an antibody heavy chain expressed from a separate expression cassette in the vector. A BD domain of the GAL4 transcription activator is fused, in a plasmid, with the sequence encoding the target protein to be selected against the antibody library. This plasmid is transformed into haploid cells of the a mating type of the yeast strain.

[0290] Equal volumes of the AD-Antibody library-containing yeast strain (α-type) and the BD-target-containing yeast strain (a-type) are first inoculated into selection liquid medium and incubated separately. These two cultures are then mixed and allowed to grow in rich medium such as 1×YPD and 2×YPD. Under the rich nutritional culture condition, the two haploid yeast strains will mate and form diploid cells. At the end of this mating process, these yeast cells are plated onto selection plates. A multiple-marker selection scheme may be used to select yeast clones that show positive interaction between the antibodies in the library and the target. For example, a scheme of SD/-Leu-Trp-His-Ade may be used. The first two selections (Leu-Trp) are for markers (Leu and Trp) expressed from the AD-Antibody library and the BD-Target vector, respectively. Through this dual-marker selection, diploid cells retaining both BD and AD vectors in the same yeast cells are selected. The latter two markers, His-Ade, are used to screen for those clones that express the reporter gene from the parental strain, presumably due to affinity binding between the antibodies in the library and the target.
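The logic of the SD/-Leu-Trp-His-Ade scheme can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the AD-Antibody library vector supplies the Leu marker, the BD-Target vector supplies the Trp marker, and the His and Ade reporters are both expressed only when the antibody binds the target. The function names are hypothetical.

```python
SD_MINUS_LEU_TRP_HIS_ADE = frozenset({"Leu", "Trp", "His", "Ade"})
SD_MINUS_LEU_TRP = frozenset({"Leu", "Trp"})


def nutrients_made(has_ad_vector: bool, has_bd_vector: bool,
                   reporters_active: bool) -> set:
    """Nutrients the cell can make for itself in this simplified model."""
    made = set()
    if has_ad_vector:
        made.add("Leu")               # marker on the AD-Antibody library vector
    if has_bd_vector:
        made.add("Trp")               # marker on the BD-Target vector
    if reporters_active:
        made.update({"His", "Ade"})   # nutritional reporters
    return made


def grows(made: set, omitted: frozenset) -> bool:
    """A cell grows only if it makes every nutrient the medium omits."""
    return omitted <= made


binder = nutrients_made(True, True, True)
non_binder = nutrients_made(True, True, False)
print(grows(binder, SD_MINUS_LEU_TRP_HIS_ADE))      # True
print(grows(non_binder, SD_MINUS_LEU_TRP_HIS_ADE))  # False
print(grows(non_binder, SD_MINUS_LEU_TRP))          # True
```

The last line shows why the dual-marker step alone only confirms that both vectors are present: a diploid without a binding antibody still grows on SD/-Leu-Trp and is removed only by the His-Ade selection.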

[0291] After the screening by co-transformation, or by mating screening as described above, the putative interaction between the gene probe and the library clone isolates can be further tested and confirmed in vitro or in vivo.

[0292] In vitro binding assays may be used to confirm the positive interaction between the tested protein expressed by the clone isolate and the target protein or peptide. For example, the in vitro binding assay may be a “pull-down” method, such as using a GST (glutathione S-transferase)-fused gene probe as the matrix-binding protein, together with an in vitro expressed library clone isolate that is labeled with a radioactive or non-radioactive group. While the probe is bound to the matrix through the GST affinity substrate (glutathione-agarose), the library clone isolate will also bind to the matrix through its affinity with the gene probe. The in vitro binding assay may also be a co-immuno-precipitation (Co-IP) method using two affinity tag antibodies. In this assay, both the target gene probe and the library clone isolate are expressed in vitro fused with peptide tags, such as HA (hemagglutinin) or Myc tags. The gene probe is first immuno-precipitated with an antibody against the affinity peptide tag (such as HA) that the target gene probe is fused with. Then a second antibody, against a different affinity tag (such as Myc) that is fused with the library clone isolate, is used for reprobing the precipitate.

[0293] In vivo assays may also be used to confirm the positive interaction between the tested protein expressed by the clone isolate and the target protein or peptide. For example, a mammalian two-hybrid system may serve as a reliable verification system for the yeast two-hybrid library screening. In this system, the target gene probe and the library clone are fused with the Gal4 DNA-binding domain and a mammalian activation domain (such as VP-16), respectively. These two fusion proteins, under control of a strong and constitutive mammalian promoter (such as the CMV promoter), are introduced into mammalian cells by transfection along with a reporter responsive to Gal4. The reporter can be the CAT gene (chloramphenicol acetyltransferase) or another commonly used reporter. Two to three days after transfection, a CAT assay or other standard assay is performed to measure the strength of the reporter signal, which is correlated with the strength of interaction between the gene probe and the library clone isolate.

[0294] The present invention also provides a kit for selecting tester proteins capable of binding to a target peptide or protein.

[0295] In an embodiment, the kit comprises: a library of tester expression vectors and a yeast cell line. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. A reporter construct may be contained in the yeast cell line. The reporter construct comprises a reporter gene whose expression is under a transcriptional control of a specific DNA binding site.

[0296] Optionally, the kit may further comprise a target expression vector which comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.

[0297] In another embodiment, the kit comprises: first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. The second population of haploid yeast cells comprises a target expression vector. The target expression vector encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.

[0298] Optionally, the second population of haploid yeast cells comprises a plurality of target expression vectors. Each of the target expression vectors encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.

[0299] According to the present invention, other yeast two-hybrid systems may be employed, including but not limited to the SOS-RAS system (SRS), the Ras recruitment system (RRS), and the ubiquitin split system. Brachmann and Boeke (1997) “Tag games in yeast: the two-hybrid system and beyond” Current Opinion Biotech. 8:561-568. In these non-conventional yeast two-hybrid systems, the first or second polypeptide subunit may further comprise a signaling domain for screening the library of the protein complexes based on these non-conventional two-hybrid methods. Examples of such signaling domains include, but are not limited to, a Ras guanyl nucleotide exchange factor (e.g. human SOS factor), a membrane targeting signal such as a myristoylation sequence or farnesylation sequence, mammalian Ras lacking the carboxy-terminal domain (the CAAX box), and a ubiquitin sequence.

[0300] SRS and RRS systems are alternative two-hybrid systems for studying protein-protein interaction in the cytoplasm. Both systems use a yeast strain with a temperature-sensitive mutation in the cdc25 gene, the yeast homologue of human Sos (hSos). This protein, a guanyl nucleotide exchange factor, binds and activates Ras, which triggers the Ras signaling pathway. The mutation in the cdc25 protein is temperature sensitive; the cells can grow at 25° C. but not at 37° C. In the SRS system, this cdc25 mutation is complemented by the hSos gene product to allow growth at 37° C., provided that the hSos protein is localized to the membrane via a protein-protein interaction (Aronheim et al. 1997, Mol. Cel. Biol. 17:3094-3102). In the RRS system, the mutation is complemented by a mammalian activated Ras lacking the CAAX box at its carboxy terminus, upon recruitment to the plasma membrane via protein-protein interaction (Broder et al, 1998, Current Biol. 8:1121-1124).
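The growth readout shared by the SRS and RRS systems can be summarized as a simple rule, sketched below in Python. The sketch models only the two temperatures named above and treats membrane recruitment of the complementing protein (hSos in SRS, Ras in RRS) as a yes/no input; the function name and this binary simplification are assumptions made for illustration.

```python
PERMISSIVE_C = 25
RESTRICTIVE_C = 37


def cdc25_mutant_grows(temperature_c: int, recruited_to_membrane: bool) -> bool:
    """Growth of the temperature-sensitive cdc25 strain in this toy model.

    At the permissive temperature the strain grows regardless of any
    interaction. At the restrictive temperature it grows only if the
    complementing protein has been recruited to the membrane by a
    protein-protein interaction.
    """
    if temperature_c == PERMISSIVE_C:
        return True
    if temperature_c == RESTRICTIVE_C:
        return recruited_to_membrane
    raise ValueError("only 25 and 37 degrees C are modeled in this sketch")


print(cdc25_mutant_grows(25, False))  # True
print(cdc25_mutant_grows(37, False))  # False
print(cdc25_mutant_grows(37, True))   # True
```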

[0301] For example, the library of expression vectors encoding a human antibody library can be constructed for selection based on the SRS system. A vector, pMyr (Stratagene, CA), is modified by replacing the f1 origin region of the pMyr expression cassette with the MET25 promoter and PGK terminator from pBridge-1 (described in EXAMPLE) through homologous recombination, resulting in pMyr-DC. The light chain sequence is cloned into the MCS site downstream from the myristoylation signal sequence using a ligation-based approach. The heavy chain is cloned into the MCS downstream from the MET25 promoter by homologous recombination. The library is made in the mutant cdc25H α strain (Stratagene, CA). The myristoylation signal anchors the antibody fusion proteins to the plasma membrane. DNA encoding the target protein is cloned into the MCS of the pSos vector, which is available from Stratagene. Such a construct expresses a fusion protein of hSos and the target protein.

[0302] The antibody library can be screened by co-transformation of the pSos with the target sequence into the cdc25H α strain. The transformed yeast cells are incubated at the restrictive temperature of 37° C. on a yeast medium plate with galactose and a low concentration of methionine, since the antibody expressions are under the control of the GAL1 and MET25 promoters, respectively.

[0303] The antibody library can also be screened by yeast mating. The pSos vector with the bait sequence is first transformed into the cdc25H a strain (available from Stratagene). The transformed a strain is then mated with the α strain containing the antibody library, followed by incubation of the mated yeast cells at the restrictive temperature of 37° C. on a yeast medium plate with galactose and a low concentration of methionine.

[0304] Alternatively, the antibody library can be made in the modified pSos. The target protein is cloned into the pMyr. Library screening can be performed similarly either by co-transformation or by mating.

[0305] 4. Selection of Affinity Binding Pairs between the Library of Protein Complexes of the Present Invention and Target Nucleic Acids

[0306] As described above, the libraries of V1 and V2 sequences of the present invention can be used for selecting protein-protein or protein-peptide binding pairs against single or arrayed multiple protein/peptide targets in a two-hybrid screening system. As described in the following, these libraries can also be used for selecting protein-DNA or protein-RNA binding pairs in a one-hybrid system or three-hybrid system, respectively.

[0307] The general scheme for screening protein-DNA binding pairs using a one-hybrid system is described in Li and Herskowitz (1993) Science 262:1870-1874. Typically, this method is used to identify genes encoding proteins that recognize a specific DNA sequence. A library of random protein segments tagged with a transcriptional activation domain (AD) is screened for proteins that can activate a reporter gene containing the specific DNA sequence in its promoter region. By using this strategy, an essential protein that interacts in vivo with the yeast origin of DNA replication was identified. In a three-hybrid system, the target nucleic acid is RNA or RNA-associated proteins. SenGupta, et al. (1996) Proc. Natl. Acad. Sci. USA 93:8496-8501.

[0308] The present invention provides a method for screening protein-DNA binding pairs in a yeast one-hybrid system.

[0309] In an embodiment, the method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under a transcriptional control of a target DNA sequence; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target DNA sequence.

[0310] In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence V1 encoding the first polypeptide subunit, and a second nucleotide sequence V2 encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain AD and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0311] In another variation of the embodiment, the step of expressing a library of tester protein complexes in yeast cells includes causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises the reporter construct.

[0312] According to the variation, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains may preferably be conducted in a rich nutritional culture medium.

[0313] According to any of the above-described methods for selecting protein-DNA binding pairs, the target DNA sequence in the reporter construct may preferably be positioned in 2-6 tandem repeats 5′ relative to the reporter gene.

[0314] The target DNA sequence in the reporter construct may be preferably between about 15-75 bp in length and more preferably between about 25-55 bp in length.
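
The repeat-number and length ranges stated above can be sketched in a few lines of Python. This is only an illustrative helper, not part of the described method: the function names and the example 30 bp sequence are hypothetical, and the sketch merely checks a candidate target against the stated 15-75 bp and 2-6 copy ranges before concatenating the repeats head to tail.

```python
def build_tandem_target(target: str, copies: int = 4) -> str:
    """Return `copies` head-to-tail repeats of a target DNA sequence.

    The limits mirror the ranges given in the text (15-75 bp, 2-6 repeats);
    the function itself is a hypothetical illustration.
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G and T")
    if not 15 <= len(target) <= 75:
        raise ValueError("target length is outside the 15-75 bp range")
    if not 2 <= copies <= 6:
        raise ValueError("number of repeats is outside the 2-6 range")
    return target * copies


def in_more_preferred_range(target: str) -> bool:
    """True if the target falls in the more preferred 25-55 bp window."""
    return 25 <= len(target) <= 55


# Hypothetical 30 bp target, used only to exercise the helper.
example_target = "ACGT" * 7 + "AC"
cassette = build_tandem_target(example_target, copies=4)
print(len(cassette))  # 120
```

The resulting cassette would then be placed 5′ relative to the reporter gene; the cloning step itself is not modeled here.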

[0315] FIG. 6 illustrates a flow diagram of a preferred embodiment of the above-described method. As illustrated in FIG. 6, the tester sequence library containing V1 fused with an AD domain upstream and V2 is carried by a library of expression vectors, the AD-V1/V2 vector. The target DNA sequence (labeled “Target DNA”) is positioned in the promoter region of a reporter gene (labeled “Reporter”).

[0316] The AD-V1/V2 vector is transformed into a yeast cell by using methods known in the art. Gietz, D. et al. (1992) “Improved method for high efficiency transformation of intact yeast cells” Nucleic Acids Res. 20:1425. The construct carrying the target DNA sequence and the reporter gene may be stably integrated into the genome of the host cell or transiently transformed into the host cell.

[0317] As illustrated in FIG. 6, upon expression of the tester sequences in the expression vectors, the library of tester protein complexes formed between AD-V1 fusion and V2, labeled as the AD-V1/V2 fusion protein complexes, undergo protein folding in the host cell and adopt various conformations. Some of the AD-V1/V2 protein complexes may bind to the target DNA sequence in the promoter region of the reporter gene, thereby bringing the AD domain into close proximity to the promoter region. As a result, the AD activates the transcription of the reporter gene downstream from the target DNA sequence, resulting in expression of the reporter gene, such as the lacZ reporter gene. Clones showing the phenotype of the reporter gene expression are selected, and the AD-V1/V2 vectors are isolated. The coding sequences for V1 and V2 are identified and characterized.

[0318] Alternatively, the AD-V1/V2 vector and the reporter construct may be introduced into a diploid yeast cell by mating between two haploid yeast strains. For example, the AD-V1/V2 vector may be transformed into a haploid yeast strain such as the α strain; and the reporter construct may be transformed into another haploid yeast strain such as the a strain. Upon mating between these two haploid strains, diploid cells are formed to merge the genetic materials carried by the two haploid cells. As a result, the AD-V1/V2 vector and the reporter construct are introduced into a diploid cell which is then screened for positive interactions between the tester protein and the target DNA in the cell.

[0319] The target DNA sequence may be a regulatory element, or a putative chromosome remodeling protein complex opening site, preferably in a short stretch of DNA sequence (20-80 bp). The target DNA sequence may be cloned into a yeast one-hybrid system reporter vector, e.g., pHIS (Clontech, Palo Alto, Calif.; Luo et al. (1996) “Cloning and analysis of DNA-binding proteins by yeast one-hybrid and one-two-hybrid system” Biotechniques 20:564-568). To increase the sensitivity, the target sequence may be cloned in a few tandem repeats (e.g., 4-5 copies) into the reporter vector. The recombinant reporter vector may be integrated into the yeast reporter strain by a transformation with linearized vector and selection for rescuing the integration marker. The integration should be at a single chromosome location and usually at high efficiency.

[0320] The tester sequence library containing V1 and V2 may encode an antibody library that can be used to screen against a target DNA antigen. The antibody expression library may be introduced into yeast by transformation or by mating with the yeast strain of the opposite mating type and harboring the reporter construct. The transformation and mating procedures are described in detail in Example 3. Pre-screening of self-activating clones may be necessary for eliminating the false positive clones. The procedures are similar to the two-hybrid library pre-screening described in Section 3.

[0321] The library clones isolated from such a one-hybrid system screening may indicate that antibody(s) expressed from these clones are capable of binding to the DNA target. Such antibodies may have significant applications in DNA vaccines and diagnostics of diseases.

[0322] The one-hybrid system of the present invention may also be modified to screen for novel co-factors that bind to a known DNA-binding factor. The library of protein complexes formed between AD-V1 fusion and V2 subunit may be screened for affinity binding toward a specific factor that binds to a DNA sequence in the promoter region of a reporter gene.

[0323] In yet another embodiment, a method is provided for screening protein-protein binding pairs in a yeast one-hybrid system. The method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under the transcriptional control of a specific DNA binding site; expressing a target protein in the yeast cells expressing the tester protein complexes, where the target protein binds to the specific DNA binding site; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target protein.

[0324] In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence V1 encoding the first polypeptide subunit, and a second nucleotide sequence V2 encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain AD and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).

[0325] In another variation of the embodiment, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises a target expression vector comprising a target sequence encoding the target protein. Either the first or second population of haploid yeast cells comprises the reporter construct.

[0326] FIG. 7 illustrates a flow diagram of a preferred embodiment of the above-described method. As illustrated in FIG. 7, the tester sequence library containing V1 fused with an AD domain upstream (AD-V1 fusion) and V2 is carried by a library of expression vectors, the AD-V1/V2 vector. The AD-V1/V2 vectors are introduced into host cells, for example, by transformation. The target protein (labeled “Target”) that is known to bind to a specific DNA sequence may be expressed by an expression vector in the host cells or otherwise present in the cells. The specific DNA sequence (labeled “*DNA”) is positioned in the promoter region of a reporter gene (labeled “Reporter”). The construct carrying the specific DNA sequence and the reporter gene may be stably integrated into the genome of the host cell or transiently transformed into the host cell.

[0327] As illustrated in FIG. 7, upon expression of the tester sequences in the expression vectors, the library of tester protein complexes formed between AD-V1 fusion and V2, labeled as the AD-V1/V2 protein complexes, undergo protein folding in the host cell and adopt various conformations. Some of the AD-V1/V2 fusion proteins may bind to the target protein that binds to the specific DNA sequence in the promoter region of the reporter gene, thereby bringing the AD domain into close proximity to the promoter region. As a result, the AD activates the transcription of the reporter gene downstream from the target DNA sequence, resulting in expression of the reporter gene, such as the lacZ reporter gene. Clones showing the phenotype of the reporter gene expression are selected, and the AD-V1/V2 vectors are isolated. The coding sequences for V1 and V2 are identified and characterized.

[0328] The specific target protein may be any protein that has been characterized to be a DNA-binding factor by using various assays such as in vitro gel shifting assays, or through conventional one-hybrid screening. The target protein (without being fused to an AD domain) may be expressed in the yeast one-hybrid reporter strain. The level of target protein expression is then adjusted to such an extent that no measurable activation is observed. The yeast strain may also contain the reporter construct that is integrated into the yeast genome.

[0329] The tester sequence library containing V1 and V2 may encode a library of antibodies that can be used to screen against a target protein that is a DNA-binding factor. The library clones isolated from such a modified one-hybrid system screening may indicate that antibody(s) expressed from these clones are capable of binding to the protein target. Such antibodies may have significant applications in therapeutics and diagnostics of diseases.

[0330] 5. High Throughput Selection of Affinity Binding Pairs between the Library of Protein Complexes of the Present Invention and a Library of Target Proteins

[0331] The present invention also provides a method for high throughput screening of the above-described libraries of protein complexes encoded by V1 and V2. The library of expression vectors, for example, the AD-antibody yeast expression vector library, may be screened for the binding of the antibodies to multiple target proteins expressed by a yeast clone library (BD-Target library), each clone carrying a BD-Target vector for each target protein to be selected against. The BD-Target clone library may be arrayed in multiple-well plates, such as 96- and 384-well plates, and then screened against the antibody library in an automated and high throughput manner.

[0332] For example, a collection of EST clones (or a total library of EST) from human, mouse or other organisms may be screened against the antibody library generated by using the methods of the present invention. Such a collection of EST clones may be ordered from a public resource in a library format with individual clones arrayed in 96-well or 384-well plates. Lennon, G. et al. (1996) “The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression” Genomics 33:151-152. The EST inserts from the original collection (usually in bacterial cloning and sequencing vectors) may be PCR amplified with extended homologous sequences at both ends following similar procedures used in the generation of the antibody library. Through the same homologous recombination procedure as used in the generation of the antibody library, the EST inserts are inserted into an expression vector containing a BD domain of a transcription activator in yeast cells.
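
The step of amplifying an insert with extended homologous sequences at both ends can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the function names, the 20-nucleotide annealing length, and the example sequences are hypothetical, and primer properties such as melting temperature are not considered. Each primer carries a vector-homology tail at its 5′ end followed by an insert-specific annealing region.

```python
# Complement lookup for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_recombination_primers(insert, left_arm, right_arm, anneal_len=20):
    """Return (forward, reverse) primers that add homology arms to an insert.

    `left_arm` and `right_arm` stand for the vector sequences flanking the
    recombination site (hypothetical inputs), written on the top strand.
    """
    insert = insert.upper()
    if len(insert) < anneal_len:
        raise ValueError("insert is shorter than the annealing region")
    forward = left_arm.upper() + insert[:anneal_len]
    # The reverse primer is written 5'->3' on the bottom strand.
    reverse = revcomp(insert[-anneal_len:] + right_arm.upper())
    return forward, reverse


def expected_product(insert, left_arm, right_arm):
    """Top strand of the PCR product: insert flanked by both arms."""
    return left_arm.upper() + insert.upper() + right_arm.upper()
```

The product returned by `expected_product` begins with the left arm and ends with the right arm, which is the feature that would allow it to recombine with a linearized vector carrying the same flanking sequences.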

[0333] Optionally, a collection of certain domain structures, such as zinc finger and helix-loop-helix protein domains, may be inserted into the BD-containing expression vector in yeast cells via homologous recombination. The yeast clones containing the vector with BD fused to each domain structure may be arrayed in multiple-well plates and screened against the antibody library for affinity binding between the antibody and each domain structure. The domain structure may be 18-20 amino acids in length and its sequence may not be totally random. Such a collection of domain structures may be generated by using synthetic oligonucleotides with characteristic conserved and random/degenerate residues to cover most of the rational domain structures.

[0334] Also optionally, the coding sequences of a random peptide library may be inserted into the BD-containing expression vector in yeast cells via homologous recombination. The yeast clones containing the vector with BD fused to each random peptide may be arrayed in multiple-well plates and screened against the antibody library for affinity binding between the antibody and each random peptide target. The random peptide may be 16-20 amino acids in length. Such a library of random peptides can be generated by random oligonucleotide synthesis or by partially random oligonucleotide synthesis biased toward a sequence encoding a specific target.
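
As a rough illustration of random oligonucleotide synthesis for such a peptide library, the sketch below generates coding sequences built from NNK codons (N = A, C, G or T; K = G or T). The NNK scheme is an assumption introduced here for illustration and is not specified in the text; it is a commonly used degenerate codon design that covers all 20 amino acids while leaving only one stop codon (TAG). The function names are hypothetical.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def random_nnk_oligo(n_codons: int, rng: random.Random) -> str:
    """Coding sequence of n_codons NNK codons (third base restricted to G/T)."""
    return "".join(rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("GT")
                   for _ in range(n_codons))


def translate(dna: str) -> str:
    """Translate complete codons; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


rng = random.Random(0)           # fixed seed so the sketch is reproducible
oligo = random_nnk_oligo(18, rng)  # 18 codons, within the 16-20 residue range
peptide = translate(oligo)
print(oligo, peptide)
```

A partially random library biased toward a specific target, as mentioned above, could be approximated by fixing some codon positions and randomizing the rest; that variant is not shown.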

[0335] Alternatively, a library of short peptides may also be inserted into the BD-containing expression vector in yeast cells via homologous recombination. Accordingly, the antibody library may be fused with the BD domain in the expression vector and screened against this library of short peptides. Through this selection, peptide ligands may be selected for each antibody. Structural and functional analysis of the selected peptides should aid in the rational design of antigens and structural improvement of specific target antigens.

[0336] FIG. 8 depicts a general scheme of high throughput screening of a library of V1/V2 protein complexes against a library of target proteins in yeast via mating of two strains of yeast haploid cells.

[0337] As illustrated in FIG. 8, each member of the library of target proteins or peptides is fused with the BD domain of an expression vector contained in a yeast a-type host strain.

[0338] The yeast clones of the library of target proteins may be arrayed as a clone library. This may be achieved by depositing each clone containing the BD-Target fusion into a well of a 96- or 384-well plate. Optionally, prior to using this library of BD-Target clones, the BD-Target library may be preselected to filter out any self-activating clones. This selection may be accomplished by allowing the yeast clones that contain the BD-Target fusion to grow in a selection medium used for two-hybrid selection at a later stage, such as the medium SD/-Trp-His. The clones are checked for self-activation of the reporter gene in the absence of the AD domain.

[0339] Alternatively, the BD-Target library may be preselected in a selection medium with β- or α-galactosidase substrate. Any positive clones will produce a colored reaction catalyzed by the galactosidase expressed from a LacZ reporter gene and can be easily detected by the naked eye or by an instrument. Such clones are self-activating clones that express the reporter gene in the absence of the AD domain. The clones may be excluded from the library of BD-Target clones.

[0340] Still referring to FIG. 8, the BD-target clones of a-strain of yeast may be inoculated into a plate which is pre-seeded with an arrayed V1/V2 library of α-strain yeast haploid cells. The two haploid yeast strains mate in the rich medium and form diploids. The parental clones are screened for expression of the reporter gene which indicates positive interactions between a V1/V2 protein complex and a target protein expressed by the clones in the same well. The scoring of the positive clones may be conveniently carried out by machine-aided automatic screening using β- or α-galactosidase substrate. Aho, S. et al. (1997) “A novel reporter gene MEL1 for the yeast two-hybrid system” Anal. Biochem. 253:270-272.

[0341] Compared to the screening of a single target protein against a library of V1/V2 protein complexes, the method illustrated in FIG. 8 is based on clonal mating, i.e., mating between a clone expressing an individual target protein and a clone expressing an individual V1/V2 protein complex. The advantage of such clonal mating is that the efficiency of mating and selection may be enhanced when large numbers of target proteins and V1/V2 protein complexes such as antibodies are involved.
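
The bookkeeping behind arrayed clonal mating can be sketched as follows. This is a hypothetical illustration only: the function names and clone identifiers are assumptions, and the sketch covers nothing more than assigning clones to wells of 96- or 384-well plates and pairing the target clone and the V1/V2 clone that share a well, so that each positive well can be traced back to an individual pair.

```python
import string

# Rows x columns for the two plate formats named in the text.
_LAYOUTS = {96: (8, 12), 384: (16, 24)}


def well_labels(plate_size: int = 96) -> list:
    """Well names in row-major order, e.g. A1..H12 for a 96-well plate."""
    rows, cols = _LAYOUTS[plate_size]
    return [f"{string.ascii_uppercase[r]}{c}"
            for r in range(rows) for c in range(1, cols + 1)]


def array_clones(clone_ids, plate_size: int = 96) -> dict:
    """Map (plate number, well) -> clone id, filling plates in order."""
    labels = well_labels(plate_size)
    layout = {}
    for index, clone in enumerate(clone_ids):
        plate, position = divmod(index, plate_size)
        layout[(plate + 1, labels[position])] = clone
    return layout


def pair_for_mating(target_layout: dict, antibody_layout: dict) -> dict:
    """Pair the target and V1/V2 clones that occupy the same plate and well."""
    return {pos: (target_layout[pos], antibody_layout[pos])
            for pos in target_layout if pos in antibody_layout}


# Hypothetical clone identifiers.
targets = array_clones([f"BD-T{i}" for i in range(1, 101)])
antibodies = array_clones([f"AD-Ab{i}" for i in range(1, 101)])
pairs = pair_for_mating(targets, antibodies)
print(pairs[(2, "A1")])  # ('BD-T97', 'AD-Ab97')
```

Keeping the plate and well as the key is one simple way to avoid losing track of individual clones during automated scoring.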

[0342] The methods described can be used for large scale screening of libraries of biomolecules, such as fully human antibody repertoires, against a wide variety of target molecules or ligands. The screening process may be automated for high throughput screening of the biomolecules. For example, such a screening process allows for efficient isolation and collection of antibodies against any EST (human, mouse, or any other organisms), or any known structural/functional protein domains (Zinc finger, helix-loop-helix, etc.), or totally random peptides with various lengths.

[0343] In contrast, by using conventional methods for screening antibodies in vivo, such as the hybridoma and “XENOMOUSE” technologies, such a large-scale and comprehensive antibody collection may have been impractical due to technical limitations associated with using animals as the host for the libraries of antibodies and target molecules.

[0344] By using the method of the present invention, the antibody repertoires can be screened for affinity interaction between an antibody in the library and a target antigen individually in vivo by clonal mating without losing track of individual clones. The screening should be more efficient than the procedure performed on mice, owing to the fast proliferation rate and ease of handling of yeast cells.

[0345] The method of the present invention should provide very useful tools for profiling functions of genes, in particular, functional proteomics, efficiently and economically. With the completion of human genome sequencing, the demands are tremendous for efficient large-scale screening for functional proteins aimed at large numbers of target molecules. The high affinity and functional antibodies, as well as other multimeric proteins, that are selected by using the methods of the present invention should find a wide variety of applications in prevention, diagnosis, and therapeutic treatment of diseases and in other biomedical or industrial uses.

[0346] 6. Mutagenesis of the Fusion Protein Leads Positively Selected against Target Protein(s)

[0347] As described above, protein leads, such as dsFv, Fab or antibody leads, can be identified through selection of the primary library carrying V1 and V2 against one or more target proteins. The coding sequences of these protein leads may be mutagenized in vitro or in vivo to generate a secondary library more diverse than these leads. The mutagenized leads can be selected against the target protein(s) again in vivo following similar procedures described for the selection of the primary library carrying V1 and V2. Such mutagenesis and selection of primary antibody leads effectively mimics the affinity maturation process naturally occurring in a mammal that produces antibody with progressive increase in the affinity to the immunizing antigen.

[0348] The coding sequences of the fusion protein leads may be mutagenized by using a wide variety of methods. Examples of methods of mutagenesis include, but are not limited to, site-directed mutagenesis, error-prone PCR mutagenesis, cassette mutagenesis, random PCR mutagenesis, DNA shuffling, and chain shuffling.

[0349] Site-directed mutagenesis or point mutagenesis may be used to gradually change the V1 and V2 sequences in specific regions. This is generally accomplished by using oligonucleotide-directed mutagenesis. For example, a short sequence of an antibody lead may be replaced with a synthetically mutagenized oligonucleotide in either the heavy chain or light chain region or both. The method may not be efficient for mutagenizing large numbers of V1 and V2 sequences, but may be used for fine tuning of a particular lead to achieve higher affinity toward a specific target protein.

[0350] Cassette mutagenesis may also be used to mutagenize the V1 and V2 sequences in specific regions. In a typical cassette mutagenesis, a sequence block, or a region, of a single template is replaced by a completely or partially randomized sequence. However, the maximum information content that can be obtained may be statistically limited by the number of random sequences of the oligonucleotides. Similar to point mutagenesis, this method may also be used for fine tuning of a particular lead to achieve higher affinity toward a specific target protein.

[0351] Error-prone PCR, or “poison” PCR, may be used to mutagenize the V1 and V2 sequences by following protocols described in Caldwell and Joyce (1992) PCR Methods and Applications 2:28-33; Leung, D. W. et al. (1989) Technique 1:11-15; Shafikhani, S. et al. (1997) Biotechniques 23:304-306; and Stemmer, W. P. et al. (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751.

[0352] FIG. 9 illustrates an example of the method of the present invention for affinity maturation of antibody leads selected from the primary antibody library. As illustrated in FIG. 9, the coding sequences of the antibody leads selected from clones containing the primary library are mutagenized by using a poison PCR method. Since the coding sequences of the antibody library are contained in the expression vectors isolated from the selected clones, one or more pairs of PCR primers may be used to specifically amplify the V_(H) and V_(L) region out of the vector. The PCR fragments containing the V_(H) and V_(L) sequences are mutagenized by the poison PCR under conditions that favor incorporation of mutations into the product.

[0353] Such conditions for poison PCR may include a) high concentrations of Mn²⁺ (e.g. 0.4-0.6 mM) that efficiently induce malfunction of Taq DNA polymerase; and b) a disproportionally high concentration of one nucleotide substrate (e.g., dGTP) in the PCR reaction that causes incorrect incorporation of this high concentration substrate into the template and produces mutations. Additionally, other factors, such as the number of PCR cycles, the species of DNA polymerase used, and the length of the template, may affect the rate of mis-incorporation of “wrong” nucleotides into the PCR product. Commercially available kits may be utilized for the mutagenesis of the selected antibody library, such as the “Diversity PCR random mutagenesis kit” (catalog No. K1830-1, Clontech, Palo Alto, Calif.).
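
How the number of cycles, the template length, and a nucleotide bias combine to set the mutation load can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a model of any actual polymerase: the per-base error rate, the weighting toward one favored base, and all function names are hypothetical, and only a single lineage that is copied once per cycle is followed.

```python
import random


def mutate_once(seq, error_rate, favored="G", favored_weight=3.0, rng=None):
    """Copy a sequence once, substituting each base with probability error_rate.

    Substitutions are drawn with extra weight on `favored`, a crude stand-in
    for one nucleotide substrate being present in excess (an assumption).
    """
    rng = rng or random.Random()
    copied = []
    for base in seq:
        if rng.random() < error_rate:
            options = [b for b in "ACGT" if b != base]
            weights = [favored_weight if b == favored else 1.0 for b in options]
            copied.append(rng.choices(options, weights=weights, k=1)[0])
        else:
            copied.append(base)
    return "".join(copied)


def error_prone_lineage(template, cycles, error_rate, rng=None):
    """Follow one lineage through `cycles` successive error-prone copies."""
    rng = rng or random.Random()
    seq = template.upper()
    for _ in range(cycles):
        seq = mutate_once(seq, error_rate, rng=rng)
    return seq


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


template = "ATGGCC" * 50  # hypothetical 300-base segment
mutant = error_prone_lineage(template, cycles=30, error_rate=0.001,
                             rng=random.Random(1))
print(count_differences(template, mutant))
```

Under these assumptions the expected number of substitution events per lineage grows roughly with length × error rate × number of cycles, which is consistent with the statement that cycle number and template length affect the mis-incorporation outcome.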

[0354] The PCR primer pairs used in mutagenesis PCR may preferably include regions matched with the homologous recombination sites in the expression vectors. This design allows re-introduction of the PCR products after mutagenesis back into the yeast host strain again via homologous recombination. This also allows the modified V_(H) or V_(L) region to be fused with the AD domain directly in the expression vector in the yeast.

[0355] Still referring to FIG. 9, the mutagenized scFv fragments are inserted into the expression vector containing an AD domain via homologous recombination in haploid cells of α-type yeast strain. Similarly to the selection of antibody clones from the primary antibody library, the AD-antibody containing haploid cells are mated with haploid cells of the opposite mating type (e.g. a type) that contain the BD-Target vector and the reporter gene construct. The parental diploid cells are selected based on expression of the reporter gene and other selection criteria as described in detail in Section 3.

[0356] Other PCR-based mutagenesis methods can also be used, alone or in conjunction with the poison PCR described above. For example, the PCR amplified V_(H) and V_(L) segments may be digested with DNase to create nicks in the double-stranded DNA. These nicks can be expanded into gaps by other exonucleases such as Bal 31. The gaps may then be filled by random sequences by using DNA Klenow polymerase at low concentration of regular substrates dGTP, dATP, dTTP, and dCTP with one substrate (e.g., dGTP) at a disproportionately high concentration. This fill-in reaction should produce high frequency mutations in the filled gap regions. This method of DNase I digestion may be used in conjunction with poison PCR to create the highest frequency of mutations in the desired V_(H) and V_(L) segments.

[0357] The PCR amplified V_(H) and V_(L) segments or antibody heavy chain and light chain segments may be mutagenized in vitro by using DNA shuffling techniques described by Stemmer (1994) Nature 370:389-391; and Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. The V_(H), V_(L) or antibody segments from the primary antibody leads are digested with DNase I into random fragments which are then reassembled to their original size by homologous recombination in vitro by using PCR methods. As a result, the diversity of the library of primary antibody leads is increased as the number of cycles of molecular evolution increases in vitro.

[0358] The V_(H), V_(L) or antibody segments amplified from the primary antibody leads may also be mutagenized in vivo by exploiting the inherent ability of mutation in pre-B cells. The Ig gene in pre-B cells is specifically susceptible to a high rate of mutation in the development of pre-B cells. The Ig promoter and enhancer facilitate such high rate mutations in a pre-B cell environment while the pre-B cells proliferate. Accordingly, V_(H) and V_(L) gene segments may be cloned into a mammalian expression vector that contains human Ig enhancer and promoter. This construct may be introduced into a pre-B cell line, such as 38B9, which allows the mutation of the V_(H) and V_(L) gene segments naturally in the pre-B cells. Liu, X., and Van Ness, B. (1999) Mol. Immunol. 36:461-469. The mutagenized V_(H) and V_(L) segments can be amplified from the cultured pre-B cell line and re-introduced back into the AD-containing yeast strain via, for example, homologous recombination.

[0359] The secondary antibody library produced by mutagenesis in vitro (e.g. PCR) or in vivo, i.e., by passing through a mammalian pre-B cell line, may be cloned into an expression vector and screened against the same target protein as in the first round of screening using the primary antibody library. For example, the expression vectors containing the secondary antibody library may be transformed into haploid cells of α-type yeast strain. These α cells are mated with haploid cells of a-type yeast strain containing the BD-target expression vector and the reporter gene construct. The positive interaction of antibodies from the secondary antibody library is screened by following similar procedures as described for the selection of the primary antibody leads in yeast.

[0360] Alternatively, since the secondary antibody library may be relatively low in complexity (e.g., 10⁴-10⁵ independent clones) as compared to the primary libraries (e.g., 10⁷-10¹⁴), the screening of the secondary antibody library may be performed without mating between two yeast strains. Instead, the linearized expression vectors containing the AD domain and the mutagenized V_(H) and V_(L) segments may be directly co-transformed into yeast cells containing the BD-target expression vector and the reporter gene construct. Via homologous recombination in yeast, the secondary antibody library is expressed by the recombined AD-antibody vector and screened against the target protein expressed by the BD-target vector by following similar procedures as described for the selection of the primary antibody leads in yeast.
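
The practical consequence of the lower complexity of the secondary library can be illustrated with a standard sampling estimate. The sketch below is an illustration added here, not a procedure from the text: it assumes that all clones are equally represented and sampled independently, in which case the probability that a given clone appears among N transformants drawn from a library of L clones is P = 1 - (1 - 1/L)^N. The function name and the 95% default are hypothetical choices.

```python
import math


def clones_needed(library_size: int, probability: float = 0.95) -> int:
    """Transformants needed so that a given clone is sampled with `probability`.

    Solves P = 1 - (1 - 1/L)**N for N, assuming equal representation.
    """
    if library_size < 1 or not 0.0 < probability < 1.0:
        raise ValueError("library_size must be >= 1 and 0 < probability < 1")
    if library_size == 1:
        return 1
    # log1p keeps precision when 1/L is very small.
    return math.ceil(math.log(1.0 - probability)
                     / math.log1p(-1.0 / library_size))


for size in (10**4, 10**5, 10**7):
    print(size, clones_needed(size))
```

Under these assumptions roughly three times the library size is needed for 95% confidence, so a library of 10⁴-10⁵ clones calls for on the order of 10⁴-10⁵ to 10⁵-10⁶ transformants, far fewer than a primary library of 10⁷ or more clones would require.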

[0361] 7. Functional Expression and Purification of Selected Antibody

[0362] The library of protein complexes encoded by V1 and V2 that are generated and selected in the screening against the target protein(s) may be expressed in hosts after the V1 and V2 sequences are operably linked to an expression control DNA sequence, including naturally-associated or heterologous promoters, in an expression vector. By operably linking the V1 and V2 sequences to an expression control sequence, the V1 and V2 coding sequences are positioned to ensure the transcription and translation of these inserted sequences. The expression vector may be replicable in the host organism as episomes or as an integral part of the host chromosomal DNA. The expression vector may also contain selection markers such as antibiotic resistance genes (e.g. neomycin and tetracycline resistance genes) to permit detection of those cells transformed with the expression vector.

[0363] Preferably, the expression vector may be a eukaryotic vector capable of transforming or transfecting eukaryotic host cells. Once the expression vector has been incorporated into the appropriate host cells, the host cells are maintained under conditions suitable for high level expression of protein complexes encoded by V1 and V2, such as dcFv, Fab and antibody. The polypeptides expressed are collected and purified depending on the expression system used.

[0364] The dcFv, Fab, or fully assembled antibodies selected by using the methods of the present invention may be expressed in various scales in any host system. FIG. 11 illustrates examples of host systems: bacteria (e.g. E. coli), yeast (e.g. S. cerevisiae), and mammalian cells (COS). The bacteria expression vector may preferably contain the bacterial phage T7 promoter and express the heavy chain and/or light chain region of the selected antibody. The yeast expression vector may contain a constitutive promoter (e.g. ADH1 promoter) or an inducible promoter (e.g. GCN4 and Gal 1 promoters). All three types of antibody, dcFv, Fab, and full antibody, may be expressed in a yeast expression system.

[0365] The expression vector may be a mammalian expression vector that can be used to express the protein complexes encoded by V1 and V2 in mammalian cell culture transiently or stably. Examples of mammalian cell lines that may be suitable for secreting immunoglobulins include, but are not limited to, various COS cell lines, HeLa cells, myeloma cell lines, CHO cell lines, transformed B-cells and hybridomas.

[0366] Typically, a mammalian expression vector includes certain expression control sequences, such as an origin of replication, a promoter, an enhancer, as well as necessary processing signals, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Examples of promoters include, but are not limited to, insulin promoter, human cytomegalovirus (CMV) promoter and its early promoter, simian virus SV40 promoter, Rous sarcoma virus LTR promoter/enhancer, the chicken cytoplasmic β-actin promoter, promoters derived from immunoglobulin genes, bovine papilloma virus and adenovirus.

[0367] One or more enhancer sequences may be included in the expression vector to increase the transcription efficiency. Enhancers are cis-acting sequences of between 10 and 300 bp that increase transcription by a promoter. Enhancers can effectively increase transcription when positioned either 5′ or 3′ to the transcription unit. They may also be effective if located within an intron or within the coding sequence itself. Examples of enhancers include, but are not limited to, SV40 enhancers, cytomegalovirus enhancers, polyoma enhancers, the mouse immunoglobulin heavy chain enhancer, and adenovirus enhancers.

[0368] The mammalian expression vector may also typically include a selectable marker gene. Examples of suitable markers include, but are not limited to, the dihydrofolate reductase gene (DHFR), the thymidine kinase gene (TK), or prokaryotic genes conferring antibiotic resistance. The DHFR and TK genes prefer the use of mutant cell lines that lack the ability to grow without the addition of thymidine to the growth medium. Transformed cells can then be identified by their ability to grow on non-supplemented media. Examples of prokaryotic drug resistance genes useful as markers include genes conferring resistance to G418, mycophenolic acid and hygromycin.

[0369] The expression vectors containing the V1 and V2 sequences can then be transferred into the host cell by methods known in the art, depending on the type of host cells. Examples of transfection techniques include, but are not limited to, calcium phosphate transfection, calcium chloride transfection, lipofection, electroporation, and microinjection.

[0370] The V1 and V2 sequences may also be inserted into a viral vector, such as an adenoviral vector, that can replicate in its host cell and produce the polypeptide encoded by V1 and V2 in large amounts.

[0371] In particular, as illustrated in FIG. 11, the dcFv, Fab, or fully assembled antibody may be expressed in mammalian cells by using a method described by Persic et al. (1997) Gene, 187:9-18. The mammalian expression vector that is described by Persic and contains EF-α promoter and SV40 replication origin is preferably utilized. The SV40 origin allows a high level of transient expression in cells containing large T antigen such as the COS cell line. The expression vector may also include a secretion signal and different antibiotic markers (e.g. neo and hygro) for integration selection.

[0372] Once expressed, polypeptides encoded by V1 and V2 may be isolated and purified by using standard procedures of the art, including ammonium sulfate precipitation, fraction column chromatography, and gel electrophoresis. Once purified, partially or to homogeneity as desired, the polypeptides may then be used therapeutically, in developing and performing assay procedures, in immunofluorescent staining, and in other biomedical and industrial applications. In particular, the antibodies generated by the method of the present invention may be used for the diagnosis and treatment of various diseases such as cancer, autoimmune diseases, or viral infections.

[0373] In a preferred embodiment, the human antibodies that are generated and screened by using the methods of the present invention may be expressed directly in yeast. According to this embodiment, the heavy chain and light chain regions from the selected expression vectors may be PCR amplified with primers that simultaneously add appropriate homologous recombination sequences to the PCR products. These PCR segments of heavy chain and light chain may then be introduced into a yeast strain together with a linearized expression vector containing desirable promoters, expression tags and other transcriptional or translational signals.

[0374] For example, the PCR segments of heavy chain and light chain regions may be homologously recombined with a yeast expression vector that already contains a desirable promoter upstream, and stop codons and a transcription termination signal downstream. The promoter may be a constitutive expression promoter such as ADH1, or an inducible expression promoter, such as Gal 1, or GCN4 (A. Mimran, I. Marbach, and D. Engelberg, (2000) Biotechniques 28:552-560). The latter inducible promoter may be preferred because the induction can be easily achieved by adding 3-AT into the medium.

[0375] The yeast expression vector to be used for expression of the antibody may be any standard vector carrying a nutritional selection marker, such as His 3, Ade 2, Leu 2, Ura 3, Trp 1 and Lys 2. The marker used for the expression of the selected antibody may preferably be different from that of the AD vector used in the selection of antibody in the two-hybrid system. This may help to avoid potential carryover problems associated with multiple yeast expression vectors.

[0376] For expressing the dcFv antibody in a secreted form in yeast, the expression vector may include a secretion signal at the 5′ end of the V_(H) and V_(L) segments, such as an alpha factor signal and a 5-pho secretion signal. Certain commercially available vectors that contain a desirable secretion signal may also be used (e.g., pYEX-S1, catalog #6200-1, Clontech, Palo Alto, Calif.).

[0377] The dcFv antibody fragments generated may be analyzed and characterized for their affinity and specificity by using methods known in the art, such as ELISA, western blotting, and immune staining. Those dcFv antibody fragments with reasonably good affinity (with dissociation constant preferably below 10⁻⁶ M, since a lower dissociation constant corresponds to tighter binding) and specificity can be used as building blocks in Fab expression vectors, or can be further assembled with the constant region for full length antibody expression. These fully assembled human antibodies may also be expressed in yeast in a secreted form.

[0378] FIG. 10A illustrates the secondary structures of the dcFv, Fab and a fully assembled antibody. The V_(H) sequence encoding the selected dcFv protein may be linked with the constant regions of a full antibody, C_(H)1, C_(H)2 and C_(H)3. Similarly, the V_(L) sequence may be linked with the constant region C_(L). The assembly of two units of V_(H)−C_(H)1−C_(H)2−C_(H)3 and V_(L)−C_(L) leads to formation of a fully functional antibody. The present invention provides a method for producing fully functional antibody in yeast. Fully functional antibody retaining the rest of the constant regions may have a higher affinity (or avidity) than a dcFv or a Fab. The full antibody should also have a higher stability, thus allowing more efficient purification of antibody protein in large scale.

[0379] The method exploits the ability of yeast cells to take up and maintain multiple copies of plasmids of the same replication origin. According to the method, different vectors may be used to express the heavy chain and light chain separately, and yet allow for the assembly of a fully functional antibody in yeast. This approach has been successfully used in a two-hybrid system design for expressing both BD and AD fusion proteins in yeast, where the BD and AD vectors are identical in backbone structure except that the selection markers are distinct. Both vectors can be maintained in yeast in high copy numbers. Chien, C. T., et al. (1991) “The two-hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest” Proc. Natl. Acad. Sci. USA 88:9578-9582.

[0380] In the present invention, the heavy chain gene and light chain genes are placed in two different vectors. Under a suitable condition, the V_(H)−C_(H)1−C_(H)2−C_(H)3 and V_(L)−C_(L) sequences are expressed and assembled in yeast, resulting in a fully functional antibody protein with two heavy chains and two light chains. This fully functional antibody may be secreted into the medium and purified directly from the supernatant.

[0381] The dcFv with a constant region, Fab, or fully assembled antibody can be purified using methods known in the art. Conventional techniques include, but are not limited to, precipitation with ammonium sulfate and/or caprylic acid, ion exchange chromatography (e.g. DEAE), and gel filtration chromatography. Delves (1997) “Antibody Production: Essential Techniques”, New York, John Wiley & Sons, pages 90-113. Affinity-based approaches using an affinity matrix based on Protein A, Protein G or Protein L may be more efficient and result in antibody of high purity. Protein A and Protein G are bacterial cell wall proteins that bind specifically and tightly to a domain of the Fc portion of certain immunoglobulins, with differential binding affinity to different subclasses of IgG. For example, Protein G has higher affinities for mouse IgG1 and human IgG3 than does Protein A. The affinity of Protein A for IgG1 can be enhanced by a number of different methods, including the use of binding buffers with increased pH or salt concentration. Protein L binds antibodies predominantly through kappa light chain interactions without interfering with the antigen-binding site. Chateau et al. (1993) “On the interaction between Protein L and immunoglobulins of various mammalian species” Scandinavian J. Immunol., 37:399-405. Protein L has been shown to bind strongly to human kappa light chain subclasses I, III and IV and to mouse kappa chain subclass I. Protein L can be used to purify relevant kappa chain-bearing antibodies of all classes (IgG, IgM, IgA, IgD, and IgE) from a wide variety of species, including human, mouse, rat, and rabbit. Protein L can also be used for the affinity purification of scFv and Fab antibody fragments containing suitable kappa light chains. Protein L-based reagents are commercially available from Actigen, Inc., Cambridge, England. Actigen can provide a line of recombinant Protein products, including agarose conjugates for affinity purification and immobilized forms of recombinant Protein L and A fusion protein which contains four protein A antibody-binding domains and four protein L kappa-binding domains.

[0382] Other affinity matrices may also be used, including those that exploit peptidomimetic ligands, anti-immunoglobulins, mannan binding protein, and the relevant antigen. Peptidomimetic ligands resemble peptides but do not correspond to natural peptides. Many peptidomimetic ligands contain unnatural or chemically modified amino acids. For example, peptidomimetic ligands designed for the affinity purification of antibodies of the IgA and IgE classes are commercially available from Tecnogen, Piana di Monte Verna, Italy. Mannan binding protein (MBP) is a mannose- and N-acetylglucosamine-specific lectin found in mammalian sera. This lectin binds IgM. The MBP-agarose support for the purification of IgM is commercially available from Pierce.

[0383] Immunomagnetic methods that combine an affinity reagent (e.g. protein A or an anti-immunoglobulin) with the ease of separation conferred by paramagnetic beads may be used for purifying the antibody produced. Magnetic beads coated with Protein or relevant secondary antibody may be commercially available from Dynal, Inc., NY; Bangs Laboratories, Fishers, Ind.; and Cortex Biochem Inc., San Leandro, Calif.

[0384] Direct expression and purification of the selected antibody in yeast is advantageous in various aspects. As a eukaryotic organism, yeast is a more suitable system for expressing human proteins than bacteria or other lower organisms. It is more likely that yeast will make the dcFv, Fab, or fully assembled antibody in a correct conformation (folded correctly), and will add post-translational modifications such as correct disulfide bond(s) and glycosylations.

[0385] Yeast has been explored for expressing many human proteins in the past. Many human proteins have been successfully produced from yeast, such as human serum albumin (Kang, H. A. et al. (2000) Appl. Microbiol. Biotechnol. 53:578-582) and human telomerase protein and RNA complex (Bachand, F., et al. (2000) RNA 6:778-784).

[0386] Yeast has fully characterized secretion pathways. Many if not all of the genes that regulate the pathways have been identified, together with their genetics and biochemistry. Knowledge of these pathways should aid in the design of expression vectors and procedures for isolation and purification of antibody expressed in yeast.

[0387] Moreover, yeast has very few secreted proteases. This should keep the secreted recombinant protein quite stable. In addition, since yeast does not secrete many other proteins or toxic proteins, the supernatant should be relatively uncontaminated. Therefore, purification of recombinant protein from yeast supernatant should be simple, efficient and economical.

[0388] Additionally, simple and reliable methods have been developed for isolating proteins from yeast cells. Cid, V. J. et al. (1998) “A mutation in the Rho&GAP-encoding gene BEM2 of Saccharomyces cerevisiae affects morphogenesis and cell wall functionality” Microbiol. 144:25-36. Although yeast has a relatively thick cell wall that is not present in either bacterial or mammalian cells, the yeast strain can still be kept growing with the cell wall stripped from the cells. By growing the yeast strain as cells without the cell wall, secretion and purification of recombinant human antibody may be made more feasible and efficient.

[0389] By using yeast as the host system for expression, a streamlined process can be established to produce recombinant antibodies in fully assembled and purified form. This may save tremendous time and effort as compared to using other systems, such as humanization of antibody in vitro and production of fully human antibody in transgenic animals.

[0390] In summary, the compositions, kits and methods provided by the present invention should be very useful for selecting proteins such as human antibodies with high affinity and specificity against a wide variety of targets including, but not limited to, soluble proteins (e.g. growth factors, cytokines and chemokines), membrane-bound proteins (e.g. cell surface receptors), and viral antigens. The whole process of library construction, functional screening and expression of a highly diverse repertoire of human antibodies can be streamlined, and efficiently and economically performed in yeast in a high throughput and automated manner. The selected proteins can have a wide variety of applications. For example, they can be used in therapeutics and diagnosis of diseases including, but not limited to, autoimmune diseases, cancer, transplant rejection, infectious diseases and inflammation.

EXAMPLE

Example 1

[0391] Construction of Expression Vectors Containing Human Antibody Library Using Homologous Recombination in Vivo

[0392] The following illustrates examples of how to use general homologous recombination as an efficient way of constructing a recombinant human antibody library. The coding sequence of each member of the antibody library includes heavy chain and light chain regions derived from a library of human antibody repertoire. The light chain region of the antibody is fused with a two-hybrid system activation domain (AD) to form a two-hybrid expression vector in yeast. In an alternative design, the light chain region of the antibody is fused with the Aga2 subunit of yeast a-agglutinin to form a surface display expression vector in yeast. The heavy chain region of the antibody is expressed separately from the light chain region by a different promoter.

[0393] 1) Isolation of Human Antibody cDNA Gene Pool

[0394] A complex human antibody cDNA gene pool is generated by using the method described in Sambrook, J., et al. (1989) Molecular Cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; and Ausubel, F. M. et al. (1995) “Current Protocols in Molecular Biology” John Wiley & Sons, NY.

[0395] Briefly, total RNA is isolated from the white cells (mainly B cells) contained in peripheral blood supplied by un-immunized humans. Blood samples of 500 ml, which contain approximately 10⁸ B-lymphocytes, are obtained from healthy donors from Stanford Hospital Blood Center. The white blood cells are separated on Ficoll and RNA is isolated by a modified method. Sambrook, J., et al. (1989), supra; and Zhu, L. et al. (1997) “Yeast Gal 4 activation domain fusion expression libraries” in “The Yeast Two-Hybrid System”, S. Fields and P. Bartel, Ed., Oxford University Press, pages 73-98.

[0396] If starting from tissue, RNA is first isolated using standard procedures. Ramirez, F. et al. (1975) “Changes in globin messenger RNA content during erythroid cell differentiation” J. Biol. Chem. 250:6054-6058; and Sambrook, J., et al. (1989), supra. First strand cDNA synthesis is performed using the method of Marks et al., in which a set of heavy and light chain cDNA primers is designed to anneal to the constant regions for priming the synthesis of cDNA of heavy chain and light chain (both kappa and lambda) antibody genes in separate tubes. Marks et al. (1991) Eur. J. Immunol. 21:985-991.

[0397] Alternatively, human spleen, leukocyte, fetal liver, or bone marrow cDNA can be purchased directly from commercial sources, such as Clontech, Palo Alto, Calif.

[0398] 2) PCR Amplification of Heavy and Light Chain Genes

[0399] The coding sequences of human heavy and light chain genes are amplified from the cDNA library generated above by using a method described by Sblattero and Bradbury (1998) Immunotechnology 3:271-278. This method allows almost 100% coverage of all human V_(H), Vλ and Vκ genes from the known Ig gene database. Specifically, a cDNA pool from human spleen is used (human spleen Marathon-Ready cDNA, Cat. #7412-1, Clontech, Palo Alto, Calif.). Alternatively, a cDNA pool from human leukocytes can also be used (human leukocyte Marathon-Ready cDNA, catalog #7406-1, Clontech, Palo Alto, Calif.).

[0400] The genes encoding human antibody heavy chain and light chain regions are amplified separately by PCR using sets of mixed 5′ and 3′ primers for each class of variable region fragment (Fv), fragment antigen binding region (Fab) and full length heavy chain region (Ab). Primers used for PCR amplification of these regions of the heavy chain and light chain are listed in Table 2 and named as follows:

Heavy Chain Primers for Directional Cloning:
Fv, 5' primers: Sequences VH5'1-7 [SEQ ID NO: 14-20]
Fv, 3' primers: Sequences VH3'1-6 [SEQ ID NO: 21-26]
Fab, 5' primers: Sequences FabH5'1-7 [SEQ ID NO: 14-20]
Fab, 3' primers: Sequences FabH3'1 [SEQ ID NO: 27]
Full length, 5' primers: Sequences AbH5'1-7 [SEQ ID NO: 14-20]
Full length, 3' primers: Sequences AbH3'1 [SEQ ID NO: 28]

Light Chain Primers for Cloning into a Site Downstream of GAL-4 AD:
λ chain, Fv, 5' primers: Sequences Vλ5'1-9 [SEQ ID NO: 29-37]
λ chain, Fv, 3' primers: Sequences Vλ3'1-2 [SEQ ID NO: 38-39]
λ chain, Full length, 5' primers: Sequences Abλ5'1-9 [SEQ ID NO: 29-37]
λ chain, Full length, 3' primers: Sequences Abλ3'1-2 [SEQ ID NO: 40-41]
κ chain, Fv, 5' primers: Sequences Vκ5'1-4 [SEQ ID NO: 42-45]
κ chain, Fv, 3' primers: Sequences Vκ3'1-4 [SEQ ID NO: 46-49]
κ chain, Full length, 5' primers: Sequences Abκ5'1-4 [SEQ ID NO: 42-45]
κ chain, Full length, 3' primers: Sequences Abκ3'1 [SEQ ID NO: 50]

Light Chain Primers for Cloning into a Site Upstream of GAL-4 AD:
λ chain, Fv, 5' primers: Sequences Vλ5'1'-9' [SEQ ID NO: 51-59]
λ chain, Fv, 3' primers: Sequences Vλ3'1'-2' [SEQ ID NO: 60-61]
λ chain, Full length, 5' primers: Sequences Abλ5'1'-9' [SEQ ID NO: 51-59]
λ chain, Full length, 3' primers: Sequences Abλ3'1'-2' [SEQ ID NO: 62-63]
κ chain, Fv, 5' primers: Sequences Vκ5'1'-4' [SEQ ID NO: 64-67]
κ chain, Fv, 3' primers: Sequences Vκ3'1'-4' [SEQ ID NO: 68-71]
κ chain, Full length, 5' primers: Sequences Abκ5'1'-4' [SEQ ID NO: 64-67]
κ chain, Full length, 3' primers: Sequences Abκ3'1' [SEQ ID NO: 72]

[0401] Each of the heavy chain 5′-primers, which are the same for Fv, Fab and full length Ab, contains a Not I restriction site. Each of the heavy chain 3′-primers contains both Sac II and Sal I restriction sites. By using these primer sets for heavy chain regions listed in Table 2, the heavy chain library can be generated by PCR amplification of the human antibody library to incorporate restriction sites at the 5′ and 3′ ends. This library can be cleaved by restriction digestion and directionally cloned into a yeast expression vector, such as a modified pACT2 vector, pACT-DC.

[0402] Each of the λ and κ light chain 5′-primers, which are the same for Fv and full length Ab, contains a 60-bp flanking sequence (underlined) that is designed to be homologous to a section at the 5′ terminus of a linearized pACT2 or pACT-DC. Each of the λ and κ light chain 3′-primers contains a 60-bp flanking sequence (underlined) homologous to a section at the 3′ terminus of the linearized pACT2 or pACT-DC. These primer sets are used in combination to amplify the light chain regions of the human antibody gene pool from the cDNA library. The resulting PCR fragments can be used for subsequent insertion into the pACT2 or pACT-DC vector via homologous recombination. The plasmid map of vector pACT2 is shown in FIG. 12A.

[0403] Each flanking sequence added to the primary PCR product is 60 bp in length. The flanking sequence of each primer is designed such that the reading frame of the light chain sequences is conserved with the upstream GAL 4 reading frame that is encoded by the cloning vector. Depending on the cloning vector used in the next step, additional features such as epitope tags (for detection and purification) and unique restriction enzyme recognition sites (for subcloning) can also be integrated at this step by primer design.

[0404] The amplified heavy chain library can be directionally cloned in bacteria into a modified pACT2 vector, which is described below. Subsequently, the amplified light chain library can be cloned into this vector in yeast via homologous recombination by following the scheme depicted in FIG. 3.

[0405] The PCR reaction is done in a volume of 50 ul containing 5 ul of the cDNA synthesized from step 2, 20 pmol of the mixed 5′ and 3′ primers, 250 uM dNTPs, 10 mM KCl, 10 mM (NH₄)₂SO₄, 20 mM Tris-HCl (pH 8.8), 2.0 mM MgCl₂, 100 mg/ml BSA, and 1 ul (1 unit) of AdvanTaq® DNA polymerase (Clontech, Calif.). The reaction mixture is subjected to 30 cycles of amplification using a Perkin-Elmer thermal cycler. The cycle is 94° C. for 1 min (denaturation), 57° C. for 1 min (annealing), and 72° C. for 2.5 min (extension). Vλ and Vκ chain PCR products are pooled together at this stage. The PCR products are checked by electrophoresis and purified from 1.0% agarose gel using Qiax affinity matrix (Qiagen, Calif.) and resuspended in 25 ul of H₂O.
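The reaction setup above can be tallied with simple arithmetic. The following minimal sketch assumes that the 20 pmol refers to the total amount of mixed primers in the 50 ul reaction and that ramp times between temperature steps are ignored; the function names are illustrative only and are not part of the described method.

```python
def primer_concentration_um(primer_pmol, reaction_volume_ul):
    """Final primer concentration in uM (pmol per ul equals umol per liter)."""
    return primer_pmol / reaction_volume_ul


def cycling_time_min(cycles, step_minutes):
    """Total hold time in minutes for a given number of identical cycles."""
    return cycles * sum(step_minutes)


# Values taken from the reaction described in the text.
primer_um = primer_concentration_um(20, 50)        # 20 pmol in 50 ul
total_min = cycling_time_min(30, (1.0, 1.0, 2.5))  # 94, 57 and 72 deg C steps

print(f"primers: {primer_um} uM, cycling: {total_min} min")
```

Under these assumptions the mixed primers are present at 0.4 uM and the 30 cycles take 135 minutes of hold time.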

[0406] 3) Directional Cloning of Heavy Chain Library into a Two-Hybrid AD Vector in Bacteria

[0407] The PCR fragments of the antibody heavy chain cDNA gene pool generated above are cloned into a modified pACT2 vector by directional cloning in bacteria.

[0408] The original pACT2 plasmid (FIG. 12A) is modified by incorporating an expression cassette derived from the pBridge plasmid (FIG. 12B). Since pACT2 has two Bgl 2 sites flanking the original multiple cloning site (MCS), the original MCS-II in pBridge that includes Not I and Bgl 2 needs to be modified. Two oligonucleotides (Sequences A1 and A2, SEQ ID NO: 73-74) with phosphate groups at their 5′ ends are synthesized and annealed to each other. This annealed double-stranded DNA is ligated into the Bgl 2 site after the pBridge plasmid is digested with Bgl 2 and dephosphorylated. Such modification results in a new vector (pBridge-1) that contains 3 new restriction sites (i.e. Sac 2, Pvu 2 and Sal I), but lacks the Bgl 2 site.

Sequence A1 [SEQ ID NO: 73], containing Sac 2, Pvu 2 and Sal I sites in that order: 5′-pGATCCGCGGCAGCTGTCGAC-3′
Sequence A2 [SEQ ID NO: 74]: 3′-GCGCCGTCGACAGCTGCTAGp-5′
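The design of the annealed linker can be checked computationally. The sketch below uses Sequences A1 and A2 exactly as printed above; the recognition sequences (Sac 2 CCGCGG, Pvu 2 CAGCTG, Sal I GTCGAC, Bgl 2 AGATCT cut as A^GATCT) are standard values supplied here rather than taken from the text, and the helper names are hypothetical. It confirms that the two oligonucleotides pair outside 4-base 5′ overhangs, that the three new sites are present, and that neither insertion orientation regenerates a Bgl 2 site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


A1 = "GATCCGCGGCAGCTGTCGAC"        # Sequence A1, 5'->3'
A2 = "GCGCCGTCGACAGCTGCTAG"[::-1]  # Sequence A2 is printed 3'->5'; flip to 5'->3'

SITES = {"Sac 2": "CCGCGG", "Pvu 2": "CAGCTG", "Sal I": "GTCGAC"}
BGL2 = "AGATCT"  # assumed cut A^GATCT, leaving a 5'-GATC overhang


def anneal(top, bottom, overhang=4):
    """Check pairing outside the 5' overhangs and return both overhangs."""
    if reverse_complement(top[overhang:]) != bottom[overhang:]:
        raise ValueError("oligos are not complementary outside the overhangs")
    return top[:overhang], bottom[:overhang]


def ligated_top_strand(insert_top):
    """Top strand across both junctions of Bgl 2-cut arms (...A and GATCT...)."""
    return "A" + insert_top + "GATCT"


overhangs = anneal(A1, A2)
sites_found = {name: site in A1 for name, site in SITES.items()}
# Either oligo may end up as the top strand, depending on insert orientation.
bgl2_restored = any(BGL2 in ligated_top_strand(s) for s in (A1, A2))

print(overhangs, sites_found, bgl2_restored)
```

Because both overhangs read GATC, the linker can insert in either orientation, which is why both orientations are tested for a regenerated Bgl 2 site.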

[0409] The expression cassette in pBridge-1 contains the MET25 promoter (P_(MET25)) followed by a nuclear localization signal (NLS), a HA-tag and a MCS (designated MCS-II), and the PGK terminator (T_(PGK)). The following oligos (Sequences A3 and A4, SEQ ID NO: 75-76) are used as primers to amplify the cassette (˜1 kb) from pBridge-1 by PCR.

Sequence A3 [SEQ ID NO: 75], oligo corresponding to the 5′ end of P_(MET25), with an Xho I site: 5′-ACTCGAGCTTCTAATTCTTCCAACATAC
Sequence A4 [SEQ ID NO: 76], oligo complementing to the 3′ end of T_(PGK), with an Xho I site: 5′-ACTCGAGAACGCAGAATTTTCGAGTTATT

[0410] The cassette is then cloned into a cloning vector pGEM-T, and its sequence is confirmed by standard DNA sequencing methods.

[0411] FIG. 12C depicts the overall process of modifying pACT2 to generate pACT-DC, a yeast expression vector having double expression cassettes. Briefly, the vector pACT2 is digested with Not I enzyme, and treated with Klenow fragment of E. coli DNA polymerase I in the presence of dCTP and dGTP. The vector is then self-ligated to produce a plasmid pACT3 that lacks the Not I site and lox P site. The plasmid pACT3 is further digested with Spe I and treated with Klenow fragment in the presence of dNTPs. The vector is then self-ligated, resulting in pACT4 that does not contain Spe I.
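The fill-in steps described above can be illustrated with a short sketch. It assumes the standard recognition sequences and cut positions for Not I (GC^GGCCGC) and Spe I (A^CTAGT), which are not stated in the text, and models a palindromic site whose 5′ overhang is filled in and blunt-ligated to itself. The function name is hypothetical.

```python
def fill_in_and_religate(site, cut):
    """Model Klenow fill-in and self-ligation at a palindromic 5'-overhang site.

    `cut` is the number of bases before the cut on the top strand.
    Returns (overhang, junction): the single-stranded overhang that is
    filled in, and the top-strand sequence across the religated junction.
    """
    overhang = site[cut:len(site) - cut]
    junction = site[:len(site) - cut] + site[cut:]
    return overhang, junction


NOT_I = ("GCGGCCGC", 2)  # assumed cut: GC^GGCCGC
SPE_I = ("ACTAGT", 1)    # assumed cut: A^CTAGT

for name, (site, cut) in {"Not I": NOT_I, "Spe I": SPE_I}.items():
    overhang, junction = fill_in_and_religate(site, cut)
    print(name, "bases needed:", sorted(set(overhang)),
          "junction:", junction, "site retained:", site in junction)
```

On these assumptions the Not I overhang contains only G and C, which is consistent with filling in using dCTP and dGTP alone, whereas the Spe I overhang requires all four dNTPs. In both cases the religated junction no longer contains the original recognition sequence.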

[0412] Another MCS (designated MCS-III) is added to pACT4 upstream of the GAL4-AD in pACT4. This is done by PCR using primers (Sequences A5 & A6, SEQ ID NO: 77-78) with restriction sites added in the primers. As depicted in FIG. 12C, the PCR product is digested with Spe I and self-ligated. There are five restriction sites added (SgrA I, Apa I, Spe I, Sph I and BssH2, in order) between T antigen NLS and Gal4-AD domain. Since ten codons are added, the Gal4-AD is still in-frame. Its sequence is confirmed by standard DNA sequencing methods. The resulting vector is designated pACT5.

Sequence A5 [SEQ ID NO: 77], with Spe I, Sph I and BssH2 sites: 5′-ATATGACTAGTGGCATGCGCGCCAATTTTAATCAAAGTGGG
Sequence A6 [SEQ ID NO: 78], with Spe I, Apa I and SgrA I sites: 5′-ATATGACTAGTGGGCCCACCGGTGGCGGTACCCAATTCGACCTT

[0413] The expression cassette derived from pBridge is retrieved from pGEM-T by digestion with Xho I. As depicted in FIG. 12C, the DNA fragment is then ligated into pACT5 that has been digested with Sal I and dephosphorylated. This ligation destroys both Xho I and Sal I sites. The resulting plasmid will be confirmed by restriction digestions. This vector contains two different expression cassettes and is designated pACT-DC. Table 3 lists the oligonucleotides used to modify pACT2 to produce pACT-DC.
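The loss of both sites on ligating an Xho I fragment into a Sal I-cut vector follows from the hybrid junction sequences. The minimal sketch below assumes the standard cut positions C^TCGAG (Xho I) and G^TCGAC (Sal I), which are not given in the text; the function names are hypothetical.

```python
XHO_I = "CTCGAG"  # assumed cut: C^TCGAG
SAL_I = "GTCGAC"  # assumed cut: G^TCGAC


def overhang(site, cut=1):
    """5' overhang left by cutting a palindromic site after `cut` bases."""
    return site[cut:len(site) - cut]


def hybrid_junctions(vector_site, insert_site, cut=1):
    """Top-strand sequences at the left and right vector/insert junctions."""
    if overhang(vector_site, cut) != overhang(insert_site, cut):
        raise ValueError("cohesive ends are not compatible")
    left = vector_site[:cut] + insert_site[cut:]   # vector arm, then insert
    right = insert_site[:cut] + vector_site[cut:]  # insert end, then vector arm
    return left, right


left, right = hybrid_junctions(SAL_I, XHO_I)
sites_lost = all(j not in (XHO_I, SAL_I) for j in (left, right))
print(left, right, sites_lost)
```

Both enzymes leave the same TCGA overhang, so the ends ligate, but each junction carries one flanking base from each site and is therefore recognized by neither enzyme.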

[0414] The library of expression vectors containing the human antibody heavy chain library is constructed by directional cloning in bacteria. The heavy chain library amplified from the human antibody gene pool (described in Section 2 of this Example) is cloned into MCS-II of pACT-DC at the Not I site at the 5′ end and Sac II or Sal I at the 3′ end, such that expression of the heavy chain library is under the control of the promoter P_(MET25).

[0415] In order to avoid loss of heavy chain library members that are cut internally by Sac II or Sal I, the PCR amplified heavy chain library is divided into two portions. The first portion is digested with Not I and Sac II, and then ligated into pACT-DC digested with Not I and Sac II. The second portion is digested with Not I and Sal I, and then ligated into pACT-DC digested with Not I and Sal I.

[0416] The ligated products are transformed into E. coli cells. Care is taken not to have a high level of empty vector in the product. The plating density of the library is preferably no more than 0.2×10⁴ colonies per 150 mm diameter plate.

[0417] Colonies of E. coli transformants are collected and used for plasmid preparation directly. The total volume of the E. coli colonies scraped from the plates should be sufficient for a plasmid prep at maxi-level. The total library DNA prepared is subjected to quality control analysis by using these tests: 1) determination of the percentage of plasmids containing inserts (preferred to be above 95%); 2) verification of Fv, Fab, or full length heavy chain sequences; 3) determination of read-through ability of the junction region sequence; and 4) determination of the percentage of non-identical insert sequences from 2-3 dozen clones. The complexity of the heavy chain library is preferred to be about 10⁴-10⁵.

[0418] 4) Cloning of Light Chain Library into pACT-DC via Homologous Recombination in Yeast

[0419] The library of expression vectors containing both heavy chain and light chain libraries under transcriptional control of different promoters is constructed through homologous recombination in yeast. The light chain library (including both λ and κ light chain) amplified from the human antibody gene pool (described in Section 2 of this Example) is cloned into MCS (located downstream of GAL-4 AD) or MCS-III (located upstream of GAL-4 AD) of pACT-DC, such that expression of the light chain library is under the control of the promoter P_(ADH1).

[0420] The library of pACT-DC containing the heavy chain library is linearized by restriction enzyme digestion (e.g. BamH I, Xho I, preferably Sfi I) in the multiple cloning site (MCS). This is done in a 20 ul volume containing the following reagents: 10 μg of vector DNA, 1-2 ul of restriction enzyme Sfi I, and 2 ul of 10× buffer. Digestion is carried out at 37° C. overnight. The completion of the enzyme digestion is checked by electrophoresis. No further modification or purification of the linearized vector is necessary.

[0421] Alternatively, the library of light chain fragments can be cloned into the MCS-III site of pACT-DC, such that the light chain expressed is fused with the N-terminus of GAL-4 AD, i.e. upstream of GAL-4 AD.

[0422] The linearized vector DNA (10 μg) is mixed with an equal amount of the PCR amplified light chain fragments (described in Section 2 of this Example), preferably at about a 5- to 10-fold molar excess of the insert fragment. The linearized vector DNA and the PCR fragments are co-transformed into competent yeast strain Y187 (α mating type, from Clontech).
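The relationship between mass and molar excess can be sketched as follows. The calculation uses the common approximation of 650 g/mol per base pair of double-stranded DNA. The vector and insert lengths (8,000 bp and 800 bp) are hypothetical round numbers chosen for illustration and are not taken from the text, as are the function names.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of double-stranded DNA


def pmol_dsdna(mass_ug, length_bp):
    """Picomoles of a double-stranded DNA fragment of the given mass and length."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def molar_excess(insert_ug, insert_bp, vector_ug, vector_bp):
    """Molar ratio of insert to vector."""
    return pmol_dsdna(insert_ug, insert_bp) / pmol_dsdna(vector_ug, vector_bp)


def insert_ug_for_excess(vector_ug, vector_bp, insert_bp, fold):
    """Insert mass needed to reach a given fold molar excess over the vector."""
    return vector_ug * fold * insert_bp / vector_bp


# Hypothetical sizes: 8,000 bp linearized vector, 800 bp light chain fragment.
equal_mass_ratio = molar_excess(10, 800, 10, 8000)
ug_for_5_fold = insert_ug_for_excess(10, 8000, 800, 5)
print(round(equal_mass_ratio, 2), ug_for_5_fold)
```

With these assumed sizes, mixing equal masses already gives roughly a 10-fold molar excess of insert, and about 5 μg of insert per 10 μg of vector would give a 5-fold excess.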

[0423] Transformation is performed as follows. Yeast competent cells are prepared by the LiAc protocol (Gietz et al. (1992) “Improved method for high efficiency transformation of intact yeast cells” Nucleic Acids Res. 20:1425), or obtained from a commercial source (Life Technology Inc., MD). A minimum yeast competency of 10⁶ transformants/ug DNA may be required for library construction. Yeast competent cells derived from 1 liter of culture at OD₆₀₀=0.2 are used for each transformation in 50 ml conical bottom tubes. Yeast cells are thawed at 4° C., washed with de-ionized water and resuspended in 8 ml of 1× TE/LiAc (1× TE/LiAc is made up of 40% polyethylene glycol 4000, 10 mM Tris-HCl, 1 mM EDTA, pH 7.5, and 0.1 M lithium acetate). The mixture of DNA containing the linearized vector and PCR amplified inserts with extended ends is added to the tube and vortexed to mix. The tube is incubated at 30° C. for 30 min, with shaking (200 rpm). DMSO (dimethyl sulfoxide, 700 ul) is added into the tube and mixed gently. The cells in the tube are heat shocked at 42° C. in a water bath for 15 minutes with occasional swirling. After the heat shock, the cells are pelleted by a brief centrifugation at 4° C. and washed one or two times with water. The cells are resuspended in 1.5 ml of 1× TBE buffer.

[0424] Yeast cells are plated onto plates made up of selection medium. For the Y187 strain of yeast, the SD/-Leu medium is used. Harper et al. (1993), supra. The library scale transformation requires approximately 100 large plates of 150 mm in diameter. Y187 transformed with either linearized vector without insert DNA fragment or vice versa is also plated onto the same selection plates as controls. Y187 transformed with unlinearized vector pACT2 is used as a transformation efficiency control and is plated in serial dilutions. The plates are incubated bottom up at 30° C. for 3 days or more. Colony number is examined and recorded. If the yeast control transformation with unlinearized pACT2 yields at least 1 million transformants, as expected, 10 million single chain library recombinant clones are expected to be obtained from each such transformation. Any control transformation with either the linearized vector or insert DNA fragment alone is expected to yield only 1/10 or less of the number of colonies obtained with the combined vector/insert transformation. This single step of transformation is repeated until 100 million or more independent clones are obtained.
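The scale-up arithmetic in the preceding paragraph can be expressed as a short sketch. The figures of 10 million clones per transformation, a 100 million clone target, about 100 plates per transformation and a background of no more than 1/10 come from the text; the function names and the example colony counts passed to the background check are hypothetical.

```python
import math


def transformations_needed(target_clones, clones_per_transformation):
    """Number of repeat transformations required to reach the target library size."""
    return math.ceil(target_clones / clones_per_transformation)


def colonies_per_plate(clones_per_transformation, plates):
    """Average colony load per plate for one library scale transformation."""
    return clones_per_transformation // plates


def background_acceptable(combined_colonies, control_colonies):
    """True if a vector-only or insert-only control gives 1/10 or less of the combined count."""
    return control_colonies * 10 <= combined_colonies


repeats = transformations_needed(100_000_000, 10_000_000)
per_plate = colonies_per_plate(10_000_000, 100)
print(repeats, per_plate, background_acceptable(5000, 400))
```

On these figures about ten transformations are needed, each spread over 100 plates at roughly 100,000 colonies per plate.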

[0425] The yeast library recombinant colonies generated as described above are scraped from the final culture plates after growing for 5-7 days. The majority of the yeast cells are mixed with 50% (volume) of glycerol and stored at −80° C. for future library screening use. A small fraction of the yeast clones are subjected to the following quality analyses:

[0426] a. Percentage of recombinant clones: PCR amplification of the light chain insert directly from yeast with a primer pair matched with flanking vector sequences (e.g., Long PCR primer pair for AD vectors supplied by Clontech) should reveal how many clones are recombinant. Since the extended homologous regions designed for recombination between the insert and cloning vector are sufficiently long (about 60 bp), a high percentage of recombinant clones (higher than 95%) should be expected. Libraries with a minimum of 90% recombinant clones are preferably saved for screening use.

[0427] b. Insert size: The same PCR amplification of selected clones should reveal the insert size. Although a small fraction of the library may contain double or other forms of multiple inserts, the majority (>95%) should have a single insert of the expected size.

[0428] c. Fingerprinting verification of sequence diversity: PCR amplification product with the correct size is fingerprinted with frequently cutting restriction enzymes, such as Bst NI or any other 3-4 base cutters. From the agarose gel electrophoresis pattern, one can determine whether the clones analyzed are identical or distinct and diversified. The PCR products can also be sequenced directly. This will reveal the identity of inserts and the fidelity of the cloning procedure, and will prove the independence and diversity of the clones. If 100 clones are sequenced, it should be expected that only a small fraction (<5%) of clones will have multiple isolates.
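The quality analyses in items a to c above reduce to simple proportions once clones have been scored. The following is a minimal sketch, assuming that each analyzed clone has been assigned a fingerprint or sequence label; the labels, the counts and the function names are hypothetical, while the 90% and 5% thresholds are those stated in the text.

```python
from collections import Counter


def percent(part, whole):
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


def multiple_isolate_fraction(fingerprints):
    """Fraction of clones whose fingerprint or sequence is seen more than once."""
    counts = Counter(fingerprints)
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / len(fingerprints)


def library_passes(recombinant, screened, fingerprints,
                   min_recombinant_pct=90.0, max_multiple_fraction=0.05):
    """Apply the recombinant-clone and diversity thresholds from the text."""
    return (percent(recombinant, screened) >= min_recombinant_pct
            and multiple_isolate_fraction(fingerprints) < max_multiple_fraction)


# Hypothetical data: 100 clones, 96 unique patterns and one pattern seen 4 times.
patterns = [f"pattern_{i}" for i in range(96)] + ["pattern_dup"] * 4
print(percent(93, 100), multiple_isolate_fraction(patterns),
      library_passes(93, 100, patterns))
```

In this hypothetical data set, 93% of clones are recombinant and 4% of clones are multiple isolates, so the library would be retained for screening.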

[0429] 5) Alternative Design: Cloning of Light Chain Library into a Yeast Surface Display Vector (pYD1) via Homologous Recombination in Yeast

[0430] A library of yeast surface display vectors for expressing an antibody library can be constructed by following similar protocols as described above for construction of the library of two-hybrid expression vectors. Briefly, a yeast surface display vector, pYD1 (available from Invitrogen, San Diego, Calif.; Boder and Wittrup (1997) Nature Biotech. 15: 553-557), is used as the expression vector for expressing the cDNA library of human antibody described in Section 1 of this Example. The vector map of pYD1 is shown in FIG. 12D. As shown in FIG. 12D, the vector pYD1 encodes the Aga2 subunit of the yeast cell wall protein, a-agglutinin (or α-agglutinin). The Aga2 subunit forms a-agglutinin by interacting with the Aga1 subunit of a-agglutinin through disulfide bonding. The protein complex formed between Aga1 and Aga2 subunits binds to the a-agglutinin yeast adhesion receptor on the yeast cell wall, thus being displayed on the surface of yeast cells.

[0431] Using a protocol similar to that for modifying the pACT-2 vector, pYD1 is modified to include the MET25 expression cassette from the pBridge vector. The modified pYD1 is designated pYD1-DC. PCR fragments of the antibody heavy chain cDNA gene pool are cloned into a site downstream of the P_(MET25) promoter of pYD1-DC through directional cloning in bacteria. The light chain library (including λ and κ light chains) amplified from the human antibody gene pool is cloned into the MCS site downstream of the Aga2 domain through homologous recombination in yeast. The light chain is thus expressed as a fusion protein with Aga2. The library of yeast surface display vectors encoding the human antibody library is transformed into S. cerevisiae cells. The antibody formed between the heavy chain and the Aga2-light chain fusion is displayed on the surface of the yeast cells through association of the Aga1/Aga2 complex with the a-agglutinin yeast adhesion receptor on the yeast cell wall. This library of human antibodies displayed on the yeast cell surface is screened against a fluorescence-labeled target molecule. Those cells displaying antibodies that bind to the target molecule are selected by FACS.

Example 2

[0432] Screening of Antibody Libraries in Yeast with the Two-Hybrid System against Defined Protein Antigens via Mating between Two Yeast Strains

[0433] This example describes a procedure used to screen the antibody libraries generated in Example 1. The human antibody libraries are generated in a yeast strain with an α mating type. Yeast of this mating type can be readily mated with an a-type yeast by a simple mating procedure to form diploid yeast cells. Guthrie and Fink (1991) “Guide to yeast genetics and molecular biology” in Methods in Enzymology (Academic Press, San Diego) 194:1-932. The a-type yeast contains the target (probe, or bait) plasmid.

[0434] The target plasmid contains a fusion formed between the GAL4 DNA binding domain (BD) and any desired target protein that is to be used as a probe to fish out the antibodies as its affinity ligands. When the two types of yeast cells mate and form diploid cells, the probe plasmid and the library clone plasmid come together in the same cell. Therefore, if a specific antibody clone recognizes and binds to the probe protein, each of these proteins or protein fragments should bring its fusion partner (GAL4 AD or GAL4 BD) into close proximity in the promoter region of the reporter(s). Under such circumstances, the reporter construct(s) built into the yeast cells (the parental a- and/or α-type haploid cells) should be activated by the reconstituted GAL4 protein. Thus the reporter is expressed and a positive signal in the library screen is detected. Certain reporters are nutritional reporters, which allow the yeast to grow on a specific selection medium plate.

[0435] In practice, equal volumes of the bait-containing yeast strain (a-type, e.g. AH109 strain) and the antibody library-containing yeast strain (α-type, e.g. Y187 strain) are each inoculated into liquid selection medium and incubated with vigorous shaking at 30° C. for 20 hours. These cultures are then mixed in a single flask and allowed to grow in rich medium 1× YPD (20 g/l Difco peptone, 10 g/l yeast extract, and 2% glucose) for 12-16 additional hours with slow shaking at 30° C. Under the rich nutritional culture condition, the two haploid yeast strains encounter each other and mate to form diploid cells. At the end of this mating process, a good fraction (5-10%) of the yeast population present in the mating pool will have formed diploids. Bendixen, C., Gangloff, S., and Rothstein, R. (1994) “A yeast mating-selection scheme for detection of protein-protein interactions” Nucleic Acids Res. 22:1778-1779.
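The 5-10% mating efficiency determines how many library clones actually reach the diploid screen. The following Python sketch estimates the number of diploids and the expected fraction of distinct library clones represented among them. It assumes that each diploid carries one library clone drawn uniformly at random (a Poisson sampling model that is not stated in this example), and the cell numbers and library size used are hypothetical.

```python
import math


def expected_diploids(cells_in_pool, mating_efficiency):
    """Number of diploids formed, given the pool size and the mating fraction."""
    return cells_in_pool * mating_efficiency


def library_coverage(n_diploids, library_size):
    """Expected fraction of distinct library clones present in at least one diploid.

    Assumes uniform random sampling of clones (Poisson approximation).
    """
    return 1.0 - math.exp(-n_diploids / library_size)


# Hypothetical numbers: 1e9 cells in the mating pool, library of 1e7 clones.
for efficiency in (0.05, 0.10):
    diploids = expected_diploids(1e9, efficiency)
    print(efficiency, diploids, library_coverage(diploids, 1e7))
```

Under this model, the number of diploids should exceed the library size several-fold if most clones are to be screened at least once.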

[0436] After mating, the yeast cells are washed with H₂O several times and plated onto SD/-Leu-Trp-His-Ade selection plates. The first two selections are for the selection markers (Leu and Trp) expressed from the vectors and serve to retain both the BD and AD vectors in the same yeast cells. The selected cells should be diploid cells, since each haploid cell expresses only one of these markers. The latter two markers are expressed by the reporters from the host strains and serve to select clones that show positive interaction between the members of the antibody library and the target protein.
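The four-way selection logic can be summarized in a short Python sketch. The assignment of the Leu marker to the AD vector and the Trp marker to the BD vector is an assumption made for illustration (the Trp assignment is consistent with the SD/-Trp-His pre-selection of probe plasmids in Example 3), and the function name is hypothetical.

```python
def grows_on_quadruple_dropout(has_ad_vector, has_bd_vector, reporters_active):
    """Predict growth on SD/-Leu-Trp-His-Ade plates.

    Assumption: the AD (library) vector supplies the Leu marker and the
    BD (bait) vector supplies the Trp marker; His and Ade are supplied
    only when the two-hybrid reporters are activated.
    """
    leu_ok = has_ad_vector
    trp_ok = has_bd_vector
    his_ade_ok = reporters_active
    return leu_ok and trp_ok and his_ade_ok


cases = {
    "haploid with library (AD) vector only": (True, False, False),
    "haploid with bait (BD) vector only": (False, True, False),
    "diploid, antibody does not bind target": (True, True, False),
    "diploid, antibody binds target": (True, True, True),
}
for label, args in cases.items():
    print(label, "->", "grows" if grows_on_quadruple_dropout(*args) else "no growth")
```

Only the last case grows, which is why colonies on these plates are taken as candidate binders.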

Example 3

[0437] Screening of Human Antibody Libraries against a Library of Antigens in a Yeast Two-Hybrid System.

[0438] For a small number of pre-selected probes (i.e. baits or targets), the procedure of individual mating screening as described above is sufficient. However, this procedure can also be modified for screening against a large number of probes. The following list describes potential probes that occur in large numbers and may not be suitable for individual mating screening:

[0439] a. A collection of human EST clones, or a total library of human ESTs. Such an EST collection can be ordered from public resources in a library format, with individual clones arrayed in 96-well or 384-well plates. The EST inserts from the original collection (usually in bacterial cloning and sequencing vectors) are PCR amplified with primers that add extended homologous flanking sequences to both ends, which mediate homologous recombination in yeast. Then, through the same homologous recombination procedure described in Section 4) of Example 1, the EST insert can be cloned into the AD vector. A maximum of three homologous recombination events should be sufficient for the read-through fusion of each EST with the GAL4 AD. Hua, S. B. et al. (1998) “Construction of a modular human EST-derived yeast two-hybrid cDNA library for the human genome protein linkage map” Gene 215:143-152.

[0440] b. A collection of certain domain structures, such as zinc finger protein domains each having 18-20 amino acids. These domain structures may not be completely random. Synthetic oligonucleotides with characteristic conserved and random/degenerate residues can be made to cover most of the rational domain structures.

[0441] c. A completely random peptide library, each member having 16-20 amino acid residues. Such a library can also be made by random oligonucleotide synthesis. Such a library has been constructed in an AD vector. Yang, M. et al. (1995) “Protein-protein interactions analyzed with the yeast two-hybrid system” Nucleic Acids Res. 23:1152-1157. Such a library of probes can also be built in a BD vector. Each clone of such a library represents a short peptide. When the human antibody library (built in the AD vector) is screened against this library of probes, peptide ligands for each antibody can be selected. Such peptides may have potential applications in rational design and structural improvement of antigens.
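For the EST probes in item a, flanking homology is added to each insert by PCR. The Python sketch below shows one way such primers could be composed: the forward primer is the left vector arm followed by the start of the insert, and the reverse primer is the reverse complement of the end of the insert followed by the right vector arm. The arm sequences, the insert, the annealing length, and the function names are hypothetical placeholders, not the primers used in this example.

```python
# Sketch only: compose PCR primers that add vector-homology arms to an insert.
# All sequences and the annealing length are hypothetical placeholders.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def recombination_primers(insert, left_arm, right_arm, anneal_len=20):
    """Build forward and reverse primers carrying homology arms.

    The expected PCR product (top strand) is left_arm + insert + right_arm.
    """
    forward = left_arm + insert[:anneal_len]
    # The reverse primer is the reverse complement of the product's 3' end.
    reverse = reverse_complement(insert[-anneal_len:] + right_arm)
    return forward, reverse


insert = "atggctagctaggatccgattacagcatgcaagcttgacctga"
fwd, rev = recombination_primers(insert, "aaaccc", "gggttt", anneal_len=10)
print(fwd)
print(rev)
```

In practice the arms would be tens of nucleotides long, as in the primers of the sequence listing that share a common 60-nucleotide 5′ segment.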

[0442] The library of probes is cloned into a BD vector and each probe is fused with the GAL4 BD domain. This library is made as an arrayed clone library by depositing every clone obtained with a BD-probe fusion into a well in 96- or 384-well plates. This arrayed format facilitates large-scale library screening with machine-aided automation.

[0443] Prior to using the library of probes to screen against the human antibody library, the library of probes is transformed into an a-type yeast host strain to select out any self-activating clones. In this pre-selection, yeast harboring only the probe plasmids are allowed to grow in a selection medium (SD/-Trp-His) and are checked for reporter activation without the AD mating partner, the so-called self-activation.

[0444] Alternatively, the pre-selection is conducted in selection medium with α- or β-galactosidase substrate. Any positive clones will produce a colored reaction and can be easily detected by the naked eye or by instrument. Clones that send out positive signals indicating activation of the reporter gene(s) are self-activating clones, which are excluded from subsequent use as targets for the antibody library.

[0445] The machine-aided automatic screening is performed by using 96- or 384-well plates. The target clones of the a-strain are sequentially inoculated into a plate which is pre-seeded with an arrayed antibody library of the α-strain. The two haploid yeast strains mate in the rich medium and form diploids. The wells sending positive signals of reporter gene expression are detected. The screening process is similar to the individual target screening against a library in the mixed culture as described in Example 2. The difference in this case is that clonal mating (a mating between an individual target and an individual antibody) is performed here to enhance the efficiency when large numbers of targets and human antibodies are involved.
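For the arrayed format, each clone needs a stable plate and well address so that a positive well can be traced back to its antibody clone. The Python sketch below maps a running clone index to a plate number and well name for 96- and 384-well plates. Row-major filling (A1, A2, ...) and the function name are assumptions made for illustration.

```python
import string

# Rows x columns for the two plate formats named in the text.
PLATE_SHAPES = {96: (8, 12), 384: (16, 24)}


def well_position(clone_index, wells_per_plate=96):
    """Map a zero-based clone index to (plate number, well name).

    Wells are filled row by row (A1, A2, ...); plates are numbered from 1.
    """
    rows, cols = PLATE_SHAPES[wells_per_plate]
    plate, offset = divmod(clone_index, wells_per_plate)
    row, col = divmod(offset, cols)
    return plate + 1, f"{string.ascii_uppercase[row]}{col + 1}"


print(well_position(0))         # first well of the first 96-well plate
print(well_position(96))        # first well of the second 96-well plate
print(well_position(383, 384))  # last well of the first 384-well plate
```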

Example 4

[0446] Maturation of Primary Antibody Isolates by Random Mutagenesis in vitro and Re-Screening in vivo in a Yeast Two-Hybrid System

[0447] The antibody clones isolated in Examples 2-3 can have various degrees of affinity. Although high-affinity clones may be obtained with a low probability, the majority of the clones may need further modification to reach an affinity comparable to that of natural antibodies (dissociation constant of 10⁻⁹ M or lower).

[0448] In this example, the sequences of primary clones are mutagenized in vitro to incorporate random mutations into the heavy chain and/or light chain regions, thereby creating a secondary library of antibodies with increased complexity. Complexity of the secondary library is expected to be at 10⁴ or higher. So the combined diversity of primary and secondary libraries screened should be at 10¹⁴-10¹⁸, no less than the natural antibody diversification through selection/maturation in an animal.

[0449] For example, coding sequences of the light chain regions of the selected antibodies are amplified from the corresponding antibody clones by PCR. The light chain region resides in the AD vector and is fused with the GAL4 AD domain. A pair of PCR primers is used to specifically amplify the light chain region out of the vector. The primers are designed to match the regions of the cloning vector that flank the light chain genes. These regions contain sequences for homologous recombination between the cloning vector and the amplified product.

[0450] This primary PCR product is checked by agarose gel electrophoresis for correct size and amount. An aliquot of the primary PCR product is then subjected to a secondary PCR. This secondary PCR is designed to incorporate mutations into the product under two conditions: a high concentration of Mn²⁺ and a disproportionately high concentration of one nucleotide substrate in the PCR reaction. Mn²⁺ at a concentration of between 0.4 and 0.6 mM can efficiently cause Taq polymerase to incorporate mutations into the PCR product. This mis-incorporation is caused by the malfunction of Taq DNA polymerase. A single nucleotide (e.g., dGTP) at a higher concentration than the other three essential nucleotides (dATP, dTTP, and dCTP) causes incorrect incorporation of this high-concentration substrate into the newly synthesized strand and produces mutations.

[0451] Besides the two conditions listed above, other conditions may influence the rate of mis-incorporation of a “wrong” nucleotide into the PCR product, including the number of PCR cycles, the species of DNA polymerase used, and the length of the template. In this example, a pre-made kit is used (Diversity PCR Random Mutagenesis Kit, Cat. #K1830-1, Clontech, Palo Alto, Calif.). This kit contains reagents necessary for optimizing the conditions for random mutation by PCR, such as dNTP Mix and additional dGTP solution, manganese sulfate, and control PCR template and primer mix.

[0452] As suggested by the user manual for this kit, the following conditions are used for PCR mutagenesis: 640 µM MnSO₄ and 200 µM dGTP. Under these conditions, an average of 8 mutations is expected in every 1000 bp, a rate that is sufficient for scFv diversification.
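The stated rate of 8 mutations per 1000 bp can be translated into an expected mutation load per insert. The Python sketch below does this under the assumption that mutations arise independently and uniformly along the template (a Poisson model, which is not stated in this example). The insert lengths and the function name are hypothetical.

```python
import math


def mutation_load(insert_length_bp, rate_per_kb=8.0):
    """Return (mean mutations per insert, probability of an unmutated insert).

    Assumes independent, uniformly distributed mutations (Poisson model).
    """
    mean = rate_per_kb * insert_length_bp / 1000.0
    p_unmutated = math.exp(-mean)
    return mean, p_unmutated


# Hypothetical insert lengths in bp.
for length in (350, 700, 1000):
    mean, p0 = mutation_load(length)
    print(f"{length} bp: mean {mean:.1f} mutations, P(no mutation) = {p0:.4f}")
```

Under this model, very few clones of the secondary library would be expected to remain identical to their parent clone.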

[0453] This secondary antibody library is reintroduced into yeast through homologous recombination and screened directly in yeast following procedures similar to those used in the primary screening described in Example 2 and Example 3, respectively. This whole process mimics the naturally occurring affinity maturation process inherent to higher organisms, including humans.

Sequence Listing (number of SEQ ID NOs: 80)

SEQ ID NO: 1 (34 nt, DNA, artificial sequence, LoxP WT): ataacttcgt ataatgtatg ctatacgaag ttat
SEQ ID NO: 2 (34 nt, DNA, artificial sequence, LoxP511): ataacttcgt atagtataca ttatacgaag ttat
SEQ ID NO: 3 (34 nt, DNA, artificial sequence, LoxC2): acaacttcgt ataatgtatg ctatacgaag ttat
SEQ ID NO: 4 (34 nt, DNA, artificial sequence, LoxP1): ataacttcgt ataatatatg ctatacgaag ttat
SEQ ID NO: 5 (34 nt, DNA, artificial sequence, LoxP2): ataacttcgt atagcataca ttatacgaag ttat
SEQ ID NO: 6 (34 nt, DNA, artificial sequence, LoxP3): ataacttcgt ataatgtata ctatacgaag ttat
SEQ ID NO: 7 (33 nt, DNA, artificial sequence, LoxP4): ataacttcgt ataatataaa ctatacgaag tta
SEQ ID NO: 8 (34 nt, DNA, artificial sequence, LoxP5): ataacttcgt ataatctaac ctatacgaag ttat
SEQ ID NO: 9 (34 nt, DNA, artificial sequence, LoxP6): ataacttcgt ataacatagc ctatacgaag ttat
SEQ ID NO: 10 (34 nt, DNA, artificial sequence, LoxP7): ataacttcgt ataacatacc ctatacgaag ttat
SEQ ID NO: 11 (34 nt, DNA, artificial sequence, LoxP8): attacctcgt atagcataca ttatacgaag ttat
SEQ ID NO: 12 (34 nt, DNA, artificial sequence, LoxP9): ataacttcgt atagcataca ttatatgaag ttat
SEQ ID NO: 13 (34 nt, DNA, artificial sequence, LoxP10): attacctcgt atagcataca ttatatgaag ttat
SEQ ID NO: 14 (46 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacaggtg cagctgcagg agtcsg
SEQ ID NO: 15 (45 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacaggta cagctgcagc agtca
SEQ ID NO: 16 (46 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacaggtg cagctacagc agtggg
SEQ ID NO: 17 (45 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcagaggtg cagctgktgg agwcy
SEQ ID NO: 18 (47 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacaggtc cagctkgtrc agtctgg
SEQ ID NO: 19 (46 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacagrtc accttgaagg agtctg
SEQ ID NO: 20 (47 nt, DNA, artificial sequence, PCR primer): accaaggaaa aacaagcggc cgcacaggtg cagctggtgs artctgg
SEQ ID NO: 21 (40 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactatg aggagacrgt gaccagggtg
SEQ ID NO: 22 (40 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactatg aggagacggt gaccagggtt
SEQ ID NO: 23 (39 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactatg aagagacggt gaccattgt
SEQ ID NO: 24 (41 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactatg aggagacggt gaccgtggtc c
SEQ ID NO: 25 (38 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactagg ttggggcgga tgcactcc
SEQ ID NO: 26 (39 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactasg atgggccctt ggtggargc
SEQ ID NO: 27 (48 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactaac atggtttgvr ctcaactbtc ttgtccac
SEQ ID NO: 28 (42 nt, DNA, artificial sequence, PCR primer): atccaccgcg gtcgactatt tacccrgaga cagggagagg ct
SEQ ID NO: 29 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct cagtctgtsb tgacgcagcc gcc
SEQ ID NO: 30 (82 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct tcctatgwgc tgacwcagcc ac
SEQ ID NO: 31 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct tcctatgagc tgayrcagcy acc
SEQ ID NO: 32 (80 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct cagcctgtgc tgactcaryc
SEQ ID NO: 33 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct cagdctgtgg tgacycagga gcc
SEQ ID NO: 34 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct cagccwgkgc tgactcagcc mcc
SEQ ID NO: 35 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct tcctctgagc tgastcagga scc
SEQ ID NO: 36 (81 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct cagtctgyyc tgaytcagcc t
SEQ ID NO: 37 (82 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct aattttatgc tgactcagcc cc
SEQ ID NO: 38 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc taggacggts ascttggtcc
SEQ ID NO: 39 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc gaggacggtc agctgggtgc
SEQ ID NO: 40 (87 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc ttatgaacat tctgcagggg cmactgt
SEQ ID NO: 41 (87 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc ttaagagcat tctgcagggg ccactgt
SEQ ID NO: 42 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct gacatccrgd tgacccagtc tcc
SEQ ID NO: 43 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct gaaattgtrw tgacrcagtc tcc
SEQ ID NO: 44 (83 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct gatattgtgm tgacbcagwc tcc
SEQ ID NO: 45 (82 nt, DNA, artificial sequence, PCR primer): ccaccaaacc caaaaaaaga gatctgtatg gcttacccat acgatgttcc agattacgct gaaacgacac tcacgcagtc tc
SEQ ID NO: 46 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc tttgatttcc accttggtcc
SEQ ID NO: 47 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc tttgatctcc ascttggtcc
SEQ ID NO: 48 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc tttgatatcc actttggtcc
SEQ ID NO: 49 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc tttaatctcc agtcgtgtcc
SEQ ID NO: 50 (84 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc ctagcactct cccctgttga agct
SEQ ID NO: 51 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcccagtctg tsbtgacgca gccgcc
SEQ ID NO: 52 (85 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcctcctatg wgctgacwca gccac
SEQ ID NO: 53 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcctcctatg agctgayrca gcyacc
SEQ ID NO: 54 (83 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcccagcctg tgctgactca ryc
SEQ ID NO: 55 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcccagdctg tggtgacyca ggagcc
SEQ ID NO: 56 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcccagccwg kgctgactca gccmcc
SEQ ID NO: 57 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcctcctctg agctgastca ggascc
SEQ ID NO: 58 (84 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gcccagtctg yyctgaytca gcct
SEQ ID NO: 59 (85 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gccaatttta tgctgactca gcccc
SEQ ID NO: 60 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc taggacggts ascttggtcc
SEQ ID NO: 61 (80 nt, DNA, artificial sequence, PCR primer): gagatggtgc acgatgcaca gttgaagtga acttgcgggg tttttcagta tctacgattc gaggacggtc agctgggtgc
SEQ ID NO: 62 (84 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc tgaacattct gcaggggcma ctgt
SEQ ID NO: 63 (84 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc agagcattct gcaggggcca ctgt
SEQ ID NO: 64 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gccgacatcc rgdtgaccca gtctcc
SEQ ID NO: 65 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gccgaaattg trwtgacrca gtctcc
SEQ ID NO: 66 (86 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gccgatattg tgmtgacbca gwctcc
SEQ ID NO: 67 (85 nt, DNA, artificial sequence, PCR primer): gataaagcgg aattaattcc cgagcctcca aaaaagaaga gaaaggtcga attgggtacc gccgaaacga cactcacgca gtctc
SEQ ID NO: 68 (80 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc tttgatttcc accttggtcc
SEQ ID NO: 69 (80 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc tttgatctcc ascttggtcc
SEQ ID NO: 70 (80 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc tttgatatcc actttggtcc
SEQ ID NO: 71 (80 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc tttaatctcc agtcgtgtcc
SEQ ID NO: 72 (81 nt, DNA, artificial sequence, PCR primer): gttagtgaaa gtgaaggaca atgagctatc agcaatattc ccactttgat taaaattggc gcactctccc ctgttgaagc t
SEQ ID NO: 73 (20 nt, DNA, artificial sequence, oligo for mutation): gatccgcggc agctgtcgac
SEQ ID NO: 74 (20 nt, DNA, artificial sequence, oligo for mutation): gtacgtcgac agctgccgcg
SEQ ID NO: 75 (28 nt, DNA, artificial sequence, PCR primer): actcgagctt ctaattcttc caacatac
SEQ ID NO: 76 (29 nt, DNA, artificial sequence, PCR primer): actcgagaac gcagaatttt cgagttatt
SEQ ID NO: 77 (41 nt, DNA, artificial sequence, PCR primer): atatgactag tggcatgcgc gccaatttta atcaaagtgg g
SEQ ID NO: 78 (44 nt, DNA, artificial sequence, PCR primer): atatgactag tgggcccacc ggtggcggta cccaattcga cctt
SEQ ID NO: 79 (24 aa, PRT, artificial sequence, semi-rigid linker): Pro Gln Pro Gln Pro Lys Pro Gln Pro Gln Pro Gln Pro Gln Pro Lys Pro Gln Pro Lys Pro Glu Pro Glu
SEQ ID NO: 80 (15 aa, PRT, artificial sequence, linker): Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
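Several primers in the sequence listing contain IUPAC ambiguity codes (for example s, k, w, y, r, m, b, d and v), so each such entry denotes a mixture of oligonucleotides rather than a single sequence. The Python sketch below counts and enumerates the variants encoded by a degenerate primer. The code table is the standard IUPAC nucleotide code; the function names are illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, as in the listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").lower():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.replace(" ", "").lower()]
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 17 from the listing (one k, one w, one y).
seq17 = "accaaggaaa aacaagcggc cgcagaggtg cagctgktgg agwcy"
print(degeneracy(seq17))  # 8
for variant in expand(seq17):
    print(variant)
```

Full enumeration is only practical for primers of low degeneracy; for highly degenerate primers the count alone is usually sufficient.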

What is claimed is:
 1. A method for selecting tester protein complexes capable of binding to a target peptide or protein, the method comprising: expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed between a first polypeptide subunit whose sequence varies within the library and a second polypeptide subunit which is expressed as a separate protein from the first polypeptide subunit and whose sequence varies within the library independently of the first polypeptide; expressing a target fusion protein in the yeast cells expressing the tester protein complexes, the target fusion protein comprising a target peptide or protein; and selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target fusion protein.
 2. The method of claim 1, wherein expressing the library of tester protein complexes includes transforming a library of tester expression vectors into the yeast cells which contain a reporter construct comprising the reporter gene whose expression is under transcriptional control of a transcription activator comprising an activation domain and a DNA binding domain, each tester expression vector comprising a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit which is expressed as a fusion protein with either the activation domain or the DNA binding domain of the transcription activator, and a second nucleotide sequence encoding the second polypeptide subunit which is expressed as a separate protein from the first polypeptide subunit.
 3. The method of claim 2, wherein expressing a target fusion protein includes transforming a target expression vector into the yeast cells simultaneously or sequentially with the library of tester expression vectors, the target expression vector comprising a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide; and expressing the target fusion protein from the target expression vector.
 4. The method of claim 1, wherein the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types, wherein the first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins, each tester expression vector comprising a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit which is expressed as a fusion protein with either the activation domain or the DNA binding domain of the transcription activator, and a second nucleotide sequence encoding the second polypeptide subunit which is expressed as a separate protein from the first polypeptide subunit; and the second population of haploid yeast cells comprises a target expression vector comprising a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors, and a target sequence encoding the target protein or peptide; and either the first or second population of haploid yeast cells comprises a reporter construct comprising the reporter gene whose expression is under transcriptional control of the transcription activator.
 5. The method of claim 4, wherein the haploid yeast cells of opposite mating types are α and a type strains of yeast.
 6. The method of claim 5, wherein the mating between the first and second populations of haploid yeast cells of α and a type strains is in a rich nutritional culture medium.
 7. The method of claim 1, wherein the diversity of the protein complexes encoded by the library of yeast expression vectors is at least 1×10⁷.
 8. The method of claim 1, wherein the diversity of the protein complexes encoded by the library of yeast expression vectors is at least 1×10¹⁰.
 9. The method of claim 1, wherein the diversity of the protein complexes encoded by the library of yeast expression vectors is at least 1×10¹².
 10. The method of claim 1, wherein the first nucleotide sequence in the library of expression vectors comprises a coding sequence of an antibody light-chain region, and the second nucleotide sequence comprises a coding sequence of an antibody heavy-chain region.
 11. The method of claim 1, wherein the conformation of the protein complexes expressed by the library of expression vectors mimics a conformation of an antibody.
 12. The method of claim 1, further comprising: isolating the tester expression vector from the selected clones; and mutagenizing the first and second nucleotide sequences in the isolated tester expression vectors to form a library of mutagenized expression vectors.
 13. The method of claim 12, wherein the mutagenesis is selected from the group consisting of error-prone PCR mutagenesis, site-directed mutagenesis, DNA shuffling and combinations thereof.
 14. The method of claim 1, wherein the target fusion protein comprises an antigen associated with a disease state.
 15. The method of claim 1, wherein the target fusion protein comprises a tumor-surface antigen.
 16. The method of claim 1, wherein the target fusion protein comprises a human growth factor receptor.
 17. The method of claim 16, wherein the human growth factor is selected from the group consisting of epidermal growth factors, transferrin, insulin-like growth factor, transforming growth factors, interleukin-1, and interleukin-2.
 18. The method of claim 1, wherein the protein encoded by the reporter gene is selected from the group consisting of β-galactosidase, α-galactosidase, luciferase, β-glucuronidase, chloramphenicol acetyl transferase, secreted embryonic alkaline phosphatase, green fluorescent protein, enhanced blue fluorescent protein, enhanced yellow fluorescent protein, and enhanced cyan fluorescent protein.
 19. A method for selecting tester proteins capable of binding to a target peptide or protein, the method comprising: expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed in vivo between a first polypeptide subunit whose sequence varies within the library and a second polypeptide subunit which is expressed as a separate protein from the first polypeptide subunit and whose sequence varies within the library independently of the first polypeptide; expressing a plurality of target fusion proteins in the yeast cells expressing the tester proteins, each of the target fusion proteins comprising a target peptide or protein; and selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester fusion to the target fusion protein.
 20. The method of claim 19, wherein the steps of expressing the library of tester protein complexes and expressing the plurality of the target fusion proteins include causing mating between first and second populations of haploid yeast cells of opposite mating types, wherein the first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins, each tester expression vector comprising a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit which is expressed as a fusion protein with either the activation domain or the DNA binding domain of the transcription activator, and a second nucleotide sequence encoding the second polypeptide subunit which is expressed as a separate protein from the first polypeptide subunit; and the second population of haploid yeast cells comprises a plurality of target expression vectors, each of the target expression vectors comprising a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors, and a target sequence encoding the target protein or peptide, wherein either the first or second population of haploid yeast cells further comprises a reporter construct comprising the reporter gene whose expression is under transcriptional control of the transcription activator.
 21. The method of claim 20, wherein members of the library of tester expression vectors are arrayed as individual yeast clones in one or more multiple-well plates.
 22. The method of claim 20, wherein members of the library of target expression vectors are arrayed as individual yeast clones in one or more multiple-well plates.
 23. The method of claim 20, wherein the mating is based on clonal mating in which each yeast clone containing members of the tester expression vectors is mated individually with each of the members of the library of target expression vectors.